Setting-up type transforms pipelines
Collecting items from
      doc_id  standard_value  standard_type  standard_relation  pchembl  \
0       3476            44.4           IC50                  =     7.35
1       6512           180.0           IC50                  =     6.75
2       6512           120.0           IC50                  =     6.92
3       3476           311.0           IC50                  =     6.51
4       3476             6.1           IC50                  =     8.21
...      ...             ...            ...                ...      ...
2124  109537          1023.0           IC50                  =     5.99
2125  109537          3026.0           IC50                  =     5.52
2126  109537          8451.0           IC50                  =     5.07
2127  109537          5735.0           IC50                  =     5.24
2128  109537             3.4           IC50                  =     8.47

      molregno  \
0       192068
1       203908
2       204329
3       192044
4       191486
...        ...
2124   2333070
2125   2319885
2126   2333406
2127   2325502
2128    709600

      canonical_smiles  \
0     N=C(N)N1CCC[C@H](NC(=O)CNC(=O)[C@@H](CCNC(=O)c2ccc3ccccc3n2)NS(=O)(=O)Cc2ccccc2)C1O
1     Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1-n1ccccc1=O
2     Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1N1CCOCC1=O
3     N=C(N)N1CCC[C@H](NC(=O)CNC(=O)[C@@H](CNC(=O)c2cnccn2)NS(=O)(=O)Cc2ccccc2)C1O
4     N=C(N)N1CCC[C@H](NC(=O)CNC(=O)[C@@H](CCNC(=O)c2ccc(O)nc2)NS(=O)(=O)Cc2ccccc2)C1O
...   ...
2124  CC1(C)OCC([C@]2(C)C=C3CC[C@@H]4C(C)(C)[C@H](O)CC[C@@]4(C)[C@@H]3CC2)O1
2125  CC(=O)O[C@@H]1CC[C@@]2(C)[C@@H]3CC[C@](C)([C@@H](O)CO)C=C3CC[C@@H]2C1(C)C
2126  CC(=O)O[C@@H]1CC[C@@]2(C)[C@@H]3CC[C@](C)(CO)C=C3CC[C@@H]2C1(C)C
2127  CC(=O)O[C@@H]1CC[C@@]2(C)[C@@H]3CC[C@](C)(O)CC=C3CC[C@@H]2C1(C)C
2128  CN1CCc2nc(C(=O)N[C@@H]3C[C@@H](C(=O)N(C)C)CC[C@@H]3NC(=O)C(=O)Nc3ccc(Cl)cn3)sc2C1

          chembl_id  target_dictionary  target_chembl_id      l1        l2  \
0      CHEMBL117716                194         CHEMBL244  Enzyme  Protease
1      CHEMBL337921                194         CHEMBL244  Enzyme  Protease
2      CHEMBL340500                194         CHEMBL244  Enzyme  Protease
3      CHEMBL117721                194         CHEMBL244  Enzyme  Protease
4      CHEMBL331807                194         CHEMBL244  Enzyme  Protease
...             ...                ...               ...     ...       ...
2124  CHEMBL4293622                194         CHEMBL244  Enzyme  Protease
2125  CHEMBL4280434                194         CHEMBL244  Enzyme  Protease
2126  CHEMBL4293958                194         CHEMBL244  Enzyme  Protease
2127  CHEMBL4286054                194         CHEMBL244  Enzyme  Protease
2128  CHEMBL1269025                194         CHEMBL244  Enzyme  Protease

                   l3  confidence_score       act  \
0     Serine protease                 8    Active
1     Serine protease                 8    Active
2     Serine protease                 8    Active
3     Serine protease                 8    Active
4     Serine protease                 8    Active
...               ...               ...       ...
2124  Serine protease                 9  Inactive
2125  Serine protease                 9  Inactive
2126  Serine protease                 9  Inactive
2127  Serine protease                 9  Inactive
2128  Serine protease                 9    Active

      processed_smiles  \
0     N=C(N)N1CCC[C@H](NC(=O)CNC(=O)[C@@H](CCNC(=O)c2ccc3ccccc3n2)NS(=O)(=O)Cc2ccccc2)C1O
1     Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1-n1ccccc1=O
2     Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1N1CCOCC1=O
3     N=C(N)N1CCC[C@H](NC(=O)CNC(=O)[C@@H](CNC(=O)c2cnccn2)NS(=O)(=O)Cc2ccccc2)C1O
4     N=C(N)N1CCC[C@H](NC(=O)CNC(=O)[C@@H](CCNC(=O)c2ccc(O)nc2)NS(=O)(=O)Cc2ccccc2)C1O
...   ...
2124  CC1(C)OCC([C@]2(C)C=C3CC[C@@H]4C(C)(C)[C@H](O)CC[C@@]4(C)[C@@H]3CC2)O1
2125  CC(=O)O[C@@H]1CC[C@@]2(C)[C@@H]3CC[C@](C)([C@@H](O)CO)C=C3CC[C@@H]2C1(C)C
2126  CC(=O)O[C@@H]1CC[C@@]2(C)[C@@H]3CC[C@](C)(CO)C=C3CC[C@@H]2C1(C)C
2127  CC(=O)O[C@@H]1CC[C@@]2(C)[C@@H]3CC[C@](C)(O)CC=C3CC[C@@H]2C1(C)C
2128  CN1CCc2nc(C(=O)N[C@@H]3C[C@@H](C(=O)N(C)C)CC[C@@H]3NC(=O)C(=O)Nc3ccc(Cl)cn3)sc2C1

      filename
0     mols_imgs/mol_0.png
1     mols_imgs/mol_1.png
2     mols_imgs/mol_2.png
3     mols_imgs/mol_3.png
4     mols_imgs/mol_4.png
...   ...
2124  mols_imgs/mol_2124.png
2125  mols_imgs/mol_2125.png
2126  mols_imgs/mol_2126.png
2127  mols_imgs/mol_2127.png
2128  mols_imgs/mol_2128.png

[2129 rows x 17 columns]
Found 2129 items
2 datasets of sizes 1703,426
Setting up Pipeline: ColReader -- {'cols': 'filename', 'pref': '', 'suff': '', 'label_delim': None} -> PILBase.create
Setting up Pipeline: ColReader -- {'cols': 'act', 'pref': '', 'suff': '', 'label_delim': None} -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
Building one sample
Pipeline: ColReader -- {'cols': 'filename', 'pref': '', 'suff': '', 'label_delim': None} -> PILBase.create
starting from
doc_id             6512
standard_value     120
standard_type      IC50
standard_relation  =
pchembl            6.92
molregno           204329
canonical_smiles   Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1N1CCOCC1=O
chembl_id          CHEMBL340500
target_dictionary  194
target_chembl_id   CHEMBL244
l1                 Enzyme
l2                 Protease
l3                 Serine protease
confidence_score   8
act                Active
processed_smiles   Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1N1CCOCC1=O
filename           mols_imgs/mol_2.png
Name: 2, dtype: object
applying ColReader -- {'cols': 'filename', 'pref': '', 'suff': '', 'label_delim': None} gives
mols_imgs/mol_2.png
applying PILBase.create gives
PILImage mode=RGB size=400x400
Pipeline: ColReader -- {'cols': 'act', 'pref': '', 'suff': '', 'label_delim': None} -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
starting from
doc_id             6512
standard_value     120
standard_type      IC50
standard_relation  =
pchembl            6.92
molregno           204329
canonical_smiles   Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1N1CCOCC1=O
chembl_id          CHEMBL340500
target_dictionary  194
target_chembl_id   CHEMBL244
l1                 Enzyme
l2                 Protease
l3                 Serine protease
confidence_score   8
act                Active
processed_smiles   Cc1cc(NC(=O)Cc2ccc3[nH]c(-c4ccc(Cl)s4)nc3c2)ccc1N1CCOCC1=O
filename           mols_imgs/mol_2.png
Name: 2, dtype: object
applying ColReader -- {'cols': 'act', 'pref': '', 'suff': '', 'label_delim': None} gives
Active
applying Categorize -- {'vocab': None, 'sort': True, 'add_na': False} gives
TensorCategory(0)
Final sample: (PILImage mode=RGB size=400x400, TensorCategory(0))
Collecting items from the same DataFrame printed above [2129 rows x 17 columns]
Found 2129 items
2 datasets of sizes 1703,426
Setting up Pipeline: ColReader -- {'cols': 'filename', 'pref': '', 'suff': '', 'label_delim': None} -> PILBase.create
Setting up Pipeline: ColReader -- {'cols': 'act', 'pref': '', 'suff': '', 'label_delim': None} -> Categorize -- {'vocab': None, 'sort': True, 'add_na': False}
Setting up after_item: Pipeline: Resize -- {'size': (256, 256), 'method': 'crop', 'pad_mode': 'reflection', 'resamples': (2, 0), 'p': 1.0} -> ToTensor
Setting up before_batch: Pipeline:
Setting up after_batch: Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} -> Rotate -- {'size': None, 'mode': 'bilinear', 'pad_mode': 'reflection', 'mode_mask': 'nearest', 'align_corners': True, 'p': 1.0}
Building one batch
Applying item_tfms to the first sample:
Pipeline: Resize -- {'size': (256, 256), 'method': 'crop', 'pad_mode': 'reflection', 'resamples': (2, 0), 'p': 1.0} -> ToTensor
starting from
(PILImage mode=RGB size=400x400, TensorCategory(0))
applying Resize -- {'size': (256, 256), 'method': 'crop', 'pad_mode': 'reflection', 'resamples': (2, 0), 'p': 1.0} gives
(PILImage mode=RGB size=256x256, TensorCategory(0))
applying ToTensor gives
(TensorImage of size 3x256x256, TensorCategory(0))
Adding the next 3 samples
No before_batch transform to apply
Collating items in a batch
Applying batch_tfms to the batch built
Pipeline: IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} -> Rotate -- {'size': None, 'mode': 'bilinear', 'pad_mode': 'reflection', 'mode_mask': 'nearest', 'align_corners': True, 'p': 1.0}
starting from
(TensorImage of size 4x3x256x256, TensorCategory([0, 0, 0, 0], device='cuda:0'))
applying IntToFloatTensor -- {'div': 255.0, 'div_mask': 1} gives
(TensorImage of size 4x3x256x256, TensorCategory([0, 0, 0, 0], device='cuda:0'))
applying Rotate -- {'size': None, 'mode': 'bilinear', 'pad_mode': 'reflection', 'mode_mask': 'nearest', 'align_corners': True, 'p': 1.0} gives
(TensorImage of size 4x3x256x256, TensorCategory([0, 0, 0, 0], device='cuda:0'))