Dataset
===================================

In EpiLearn, we use **UniversalDataset** to load preprocessed datasets. For custom data, initialize the UniversalDataset directly with features, graphs, and states.

UniversalDataset
--------------------

.. autoclass:: epilearn.data.dataset.UniversalDataset
    :members:

Preprocessed Datasets
===================================

We collect epidemic data from various sources, including the following:

**Temporal Data**

* `Tycho_v1.0.0 `_: Includes eight diseases collected across 50 US states and 122 US cities from 1916 to 2009.

* `Measles `_: Contains measles infections in England and Wales across 954 urban centers (cities and towns) from 1944 to 1964.

**Spatial&Temporal Data**

* **Covid_static**: Contains covid infections with a static graph. `[1] `_

* **Covid_dynamic**: Contains covid infections with a dynamic graph. `[2] `_ `[3] `_

**Dataset Loading**

Loading the Measles and Tycho datasets:

.. code-block:: python

    from epilearn.data import UniversalDataset

    tycho_dataset = UniversalDataset(name='Tycho_v1', root='./tmp/')
    measle_dataset = UniversalDataset(name='Measles', root='./tmp/')

For covid data, we support the dataset from Johns Hopkins University:

.. code-block:: python

    from epilearn.data import UniversalDataset

    jhu_dataset = UniversalDataset(name='JHU_covid', root='./tmp/')

For other countries, use 'Covid\_'+'country' to load the corresponding covid dataset. Currently supported countries are China, Brazil, Austria, England, France, Italy, Newzealand, and Spain.

.. code-block:: python

    from epilearn.data import UniversalDataset

    covid_dataset = UniversalDataset(name='Covid_Brazil', root='./tmp/')

Customize Your Own Dataset
---------------------------

First, organize your data as a dictionary with the keys features, graph, dynamic_graph, targets, and states. Here is an example:

.. code-block:: python

    import torch

    data = torch.load("example.pt")
    data.keys()

.. code-block:: text

    dict_keys(['features', 'graph', 'dynamic_graph', 'targets', 'states'])

.. code-block:: python

    node_features = data['features']            # [time steps, nodes, channels]: torch.Size([539, 47, 4])
    static_graph = torch.Tensor(data['graph'])  # [nodes, nodes]: (47, 47)
    dynamic_graph = data['dynamic_graph']       # [time steps, nodes, nodes]: torch.Size([539, 47, 47])
    targets = data['targets']                   # [time steps, nodes]: torch.Size([539, 47])
    node_status = data['states']                # [time steps, nodes]: torch.Size([539, 47])

Next, you can build a `UniversalDataset` from your own data by passing the corresponding parameters according to your needs. Not every parameter is required. Refer to `UniversalDataset`_ for detailed descriptions of each parameter.

.. code-block:: python

    from epilearn.data import UniversalDataset

    dataset_sample1 = UniversalDataset(x=node_features,
                                       states=node_status,          # additional information for each node, e.g. SIR states
                                       y=targets,                   # prediction target
                                       graph=static_graph,          # adjacency matrix, we also support edge index: edge_index = ...
                                       dynamic_graph=dynamic_graph  # time-varying adjacency matrices
                                       )

    # A smaller dataset built from features, targets, and a static graph only
    dataset_sample2 = UniversalDataset(x=node_features, y=targets, graph=static_graph)

For more sample code in a real training process, refer to `examples/dataset_customization.ipynb` on the GitHub page.
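
Before building a custom dataset, it can help to confirm that the arrays agree on the number of time steps and nodes, following the shapes listed in the example above. The helper below is a minimal sketch and is not part of EpiLearn: ``check_shapes`` is a hypothetical name, it works on plain shape tuples rather than tensors, and it assumes that only ``features`` is mandatory while the other keys are checked when present.

.. code-block:: python

    from typing import Dict, Tuple

    Shape = Tuple[int, ...]


    def check_shapes(shapes: Dict[str, Shape]) -> Tuple[int, int]:
        """Check that the array shapes agree and return (time_steps, nodes).

        Hypothetical helper, not an EpiLearn function. `shapes` maps each
        dictionary key to the shape of its array, e.g. tuple(tensor.shape).
        """
        if len(shapes["features"]) != 3:
            raise ValueError("features must be [time steps, nodes, channels]")
        time_steps, nodes, _channels = shapes["features"]

        # Expected shapes, derived from the layout of the example data
        expected = {
            "graph": (nodes, nodes),
            "dynamic_graph": (time_steps, nodes, nodes),
            "targets": (time_steps, nodes),
            "states": (time_steps, nodes),
        }
        for key, want in expected.items():
            got = shapes.get(key)
            # Assumption: keys other than 'features' may be omitted
            if got is not None and tuple(got) != want:
                raise ValueError(f"{key}: expected shape {want}, got {tuple(got)}")
        return time_steps, nodes


    example = {
        "features": (539, 47, 4),
        "graph": (47, 47),
        "dynamic_graph": (539, 47, 47),
        "targets": (539, 47),
        "states": (539, 47),
    }
    print(check_shapes(example))  # (539, 47)

With the loaded dictionary, the shapes could be collected with ``{k: tuple(v.shape) for k, v in data.items()}`` and passed to the helper before calling ``UniversalDataset``.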