Data Management and Analytics

DuraMAT investigates photovoltaic (PV) material degradation and durability through an ambitious data collection and analytics effort. We aggregate data from diverse sources such as device simulation, materials characterization, and time-series PV performance into one place—the DuraMAT Data Hub.

In addition to storing heterogeneous data, we actively perform PV reliability research by mining resources in the Data Hub to find new and interesting correlations. This work inspires the development of open-source software tools that other PV researchers can freely use in their own research projects.

The software developed will be useful for:

  • Processing or cleaning data
  • Making informative visualizations
  • Building machine learning models

The software will also build on the popular and flexible Python ecosystem, using libraries such as NumPy, SciPy, pandas, scikit-learn, and Matplotlib.

We're sensitive to the fact that some data contains proprietary or protected information. Therefore, we'll ensure you're comfortable with how we use, display, store, and report your data.

Visit the Data Hub.


DuraMAT Data Hub


Materials Discovery, Selection, and Design Using Software Tools


Lawrence Berkeley National Laboratory


Ong, S.P., et al. "Python Materials Genomics (pymatgen): A robust, open-source python library for materials analysis." Computational Materials Science. vol. 68, pp. 314–319, 2013.

Jones, C.B.; Mart, M.; Carmignani, C.K.; Lavrova, O.; Robinson, C.; Stein, J.S. "Automatic Fault Classification of Photovoltaic Strings Based on an In Situ IV Characterization System and a Gaussian Process Algorithm." IEEE 43rd Photovoltaic Specialists Conference (PVSC). pp. 1708–1713, 2016.


To learn more about this capability area, contact Anubhav Jain.

Flowchart with "GUI" above "Security" above "RDBS" with a two-directional arrow next to "Data Hub" with a two-directional arrow next to "File Archives." "Data Hub" has an arrow labeled "API" pointing to a cylindrical image labeled "Outside Data Registries" and a two-directional arrow, labeled "API," to "Analytics Tools" below. Also below is an image of photovoltaic modules labeled "PV Field Deployments" with an arrow pointing to a cylindrical image labeled "Time Series and Other Databases" with an arrow labeled "API" pointing to "Analytics Tools."