Information Units
Information Units are the foundational components of EMOS that provide access to data, generation capabilities, and prediction models. They are designed with standardized interfaces to ensure consistency and enable easy integration within Features.
Overview
EMOS includes three types of Information Units:
Databases: Access to external materials databases and repositories
Generators: AI-powered tools for generating new materials and structures
Predictors: Machine learning models for property prediction and analysis
Each type follows a common design pattern with:
Standardized base classes
Factory-based instantiation
Consistent method interfaces
Error handling and logging
Design Principles
Standardized Interface
All Information Units implement standard base classes:
# Base classes define the common interface (method bodies omitted)
class BaseDatabase:
    def __init__(self, database_name, logger=None): ...
    def info(self) -> str: ...
    def retrieve(self, inputs: dict) -> str: ...

class BaseGenerator:
    def __init__(self, generator_name, logger=None): ...
    def info(self) -> str: ...
    def generate(self, inputs: dict) -> str: ...

class BasePredictor:
    def __init__(self, predictor_name, logger=None): ...
    def info(self) -> str: ...
    def predict(self, inputs: dict) -> str: ...
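One way to enforce such an interface is Python's `abc` module, which refuses to instantiate a subclass that leaves a required method unimplemented. The sketch below is only an illustration: the source does not say whether EMOS uses `abc`, and `EchoDatabase`, `IncompleteDatabase`, and `can_instantiate` are hypothetical names introduced here.

```python
from abc import ABC, abstractmethod
import logging


class BaseDatabase(ABC):
    """Hypothetical abstract variant of the BaseDatabase interface."""

    def __init__(self, database_name, logger=None):
        self.database_name = database_name
        # Fall back to a module-level logger when none is supplied.
        self.logger = logger or logging.getLogger(__name__)

    @abstractmethod
    def info(self) -> str:
        """Return a short description of the unit."""

    @abstractmethod
    def retrieve(self, inputs: dict) -> str:
        """Return retrieved data as a string."""


class EchoDatabase(BaseDatabase):
    # Implements both required methods, so it can be instantiated.
    def info(self) -> str:
        return "EchoDatabase: echoes the query back"

    def retrieve(self, inputs: dict) -> str:
        return f"{self.database_name}: {inputs.get('query', '')}"


class IncompleteDatabase(BaseDatabase):
    # Implements info() but not retrieve(), so it stays abstract.
    def info(self) -> str:
        return "IncompleteDatabase: missing retrieve()"


def can_instantiate(cls) -> bool:
    """Report whether cls implements the full interface."""
    try:
        cls("demo")
        return True
    except TypeError:
        # abc raises TypeError for classes with unimplemented abstract methods.
        return False
```

With this variant, an incomplete unit fails at construction time rather than later, when a Feature first calls the missing method.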
Factory Pattern
Information Units are registered in factory dictionaries for dynamic instantiation:
# Factory dictionaries enable easy extension
database_factory = {
    "icsd": ICSD,
    "mp": MP,
    # Easy to add new databases
}

generator_factory = {
    "mattergen": MatterGen,
    "gnome": GNoME,
    # Easy to add new generators
}

predictor_factory = {
    "m3gnet": M3GNet,
    "mattersim": MatterSim,
    # Easy to add new predictors
}
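A factory dictionary maps a string key to a class, so instantiation is a lookup followed by a call. The following minimal sketch wraps that lookup in a helper that reports the registered keys when a key is unknown. `StubUnit` and `create_unit` are hypothetical and stand in for the real unit classes and for whatever lookup code EMOS actually uses.

```python
class StubUnit:
    """Hypothetical stand-in for a real unit class such as ICSD or MP."""

    def __init__(self, name, logger=None):
        self.name = name
        self.logger = logger

    def info(self) -> str:
        return f"{self.name}: stub unit"


# Same shape as the factories above, but populated with stubs.
database_factory = {"icsd": StubUnit, "mp": StubUnit}


def create_unit(factory, key, logger=None):
    """Instantiate the unit registered under key, or raise a clear error."""
    try:
        unit_cls = factory[key]
    except KeyError:
        available = ", ".join(sorted(factory))
        raise KeyError(f"Unknown unit '{key}'. Available: {available}") from None
    # Units take their own key as the name, matching the usage pattern.
    return unit_cls(key, logger)
```

Calling `create_unit(database_factory, "mp")` returns a `StubUnit` named `"mp"`, while an unregistered key raises a `KeyError` that lists `icsd` and `mp`.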
Consistent Usage Pattern
All Information Units follow the same usage pattern within Features:
# Standard pattern for using Information Units
active_databases = inputs.get('active_databases', [])
for db_config in active_databases:
    db_key = db_config['value']
    if db_key in database_factory:
        db_instance = database_factory[db_key](db_key, self.logger)
        result = db_instance.retrieve(retrieve_inputs)
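The pattern above can be made runnable with stub classes. This sketch also adds per-unit error handling and logging so that one failing database does not abort the loop. `StubDatabase`, `FailingDatabase`, `run_databases`, and the `query` input key are assumptions made for illustration and are not taken from EMOS.

```python
import logging


class StubDatabase:
    """Hypothetical database that returns a canned string."""

    def __init__(self, database_name, logger=None):
        self.database_name = database_name
        self.logger = logger or logging.getLogger(__name__)

    def retrieve(self, inputs: dict) -> str:
        return f"{self.database_name} results for {inputs['query']}"


class FailingDatabase(StubDatabase):
    """Hypothetical database that always fails, to exercise error handling."""

    def retrieve(self, inputs: dict) -> str:
        raise RuntimeError("service unavailable")


database_factory = {"icsd": StubDatabase, "broken": FailingDatabase}


def run_databases(inputs: dict, logger=None) -> dict:
    """Query every active, registered database and collect results by key."""
    logger = logger or logging.getLogger(__name__)
    results = {}
    for db_config in inputs.get("active_databases", []):
        db_key = db_config["value"]
        if db_key not in database_factory:
            logger.warning("Skipping unregistered database: %s", db_key)
            continue
        db_instance = database_factory[db_key](db_key, logger)
        try:
            results[db_key] = db_instance.retrieve({"query": inputs["query"]})
        except Exception as exc:
            # Log and continue so the remaining databases still run.
            logger.error("Database %s failed: %s", db_key, exc)
    return results
```

Unregistered keys are skipped with a warning and failures are logged, so the returned dictionary contains only the databases that answered.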
Adding New Information Units
The modular design makes it easy to add new Information Units:
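Implement Base Class: Create a new class that inherits from the appropriate base class
Register in Factory: Add the class to the corresponding factory dictionary
No Changes Required: Existing Features look up units by key in the factory dictionaries, so a registered unit is available without changes to Feature code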
Example of adding a new database:
# 1. Implement the database
class NewDatabase(BaseDatabase):
    def info(self):
        return "NewDatabase: Description of capabilities"

    def retrieve(self, inputs):
        # Implementation here: query the source and build the result string
        results = ""
        return results

# 2. Register in factory
database_factory["newdb"] = NewDatabase

# 3. Automatically available in all Features
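The three steps can be checked end to end with a small self-contained sketch. Here `describe_units` plays the role of existing Feature code that is never modified. It, the simplified `BaseDatabase`, and the `formula` input key are hypothetical and only illustrate how registration alone makes a unit visible.

```python
class BaseDatabase:
    """Simplified stand-in for the EMOS base class (assumed shape)."""

    def __init__(self, database_name, logger=None):
        self.database_name = database_name
        self.logger = logger

    def info(self) -> str:
        raise NotImplementedError

    def retrieve(self, inputs: dict) -> str:
        raise NotImplementedError


database_factory = {}


def describe_units(factory) -> list:
    """Existing 'Feature' code: collects info() from every registered unit."""
    return [unit_cls(key).info() for key, unit_cls in factory.items()]


# 1. Implement the database
class NewDatabase(BaseDatabase):
    def info(self) -> str:
        return "NewDatabase: Description of capabilities"

    def retrieve(self, inputs: dict) -> str:
        return f"NewDatabase entry for {inputs.get('formula', 'unknown')}"


before = describe_units(database_factory)

# 2. Register in factory
database_factory["newdb"] = NewDatabase

# 3. The unchanged describe_units() now sees the new unit
after = describe_units(database_factory)
```

Here `before` is an empty list and `after` contains the `NewDatabase` description, even though `describe_units` was written before `NewDatabase` existed.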
This modular approach ensures that EMOS can easily grow and adapt to new tools and databases as they become available.