The magdb package, available from GitHub, contains all the tools needed to generate MongoDB database records with the CDESnowball toolkit (or plain CDE).
The toolkit requires MongoDB to be installed and running, and connects to it through pymongo, which can be installed using pip:
pip3 install pymongo
For further details, see the MongoDB documentation.
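Before building a database, it can help to confirm that a MongoDB server is actually reachable. The following is a minimal sketch and is not part of magdb: the helper name `mongo_is_running`, the default URI `mongodb://localhost:27017`, and the `client_factory` parameter (present only so the check can be exercised without a live server) are assumptions made for illustration.

```python
def mongo_is_running(uri="mongodb://localhost:27017", timeout_ms=2000,
                     client_factory=None):
    """Return True if a MongoDB server answers a ping at `uri`.

    Hypothetical helper for illustration; not part of magdb.
    """
    if client_factory is None:
        # Imported lazily so the function can be tried with a stand-in client.
        from pymongo import MongoClient
        client_factory = MongoClient

    # MongoClient connects lazily, so the short server-selection timeout
    # makes the ping below fail quickly when no server is listening.
    client = client_factory(uri, serverSelectionTimeoutMS=timeout_ms)
    try:
        client.admin.command("ping")
        return True
    except Exception:
        # Any failure (typically a server-selection timeout) is treated
        # as "not reachable" in this sketch.
        return False
    finally:
        client.close()
```

Calling `mongo_is_running()` with no arguments checks the assumed default local server; pass a different URI if MongoDB is hosted elsewhere.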
Creating a database of phase-transition CDE records with the magdb toolkit is simple. Just provide the system with a directory of files to process:
>>> from magdb import MagnetismDatabase
>>> # Path to the corpus
>>> corpus = 'path/to/corpus'
>>> # Create an arbitrary database name
>>> db_name = "db_test"
>>> # Create the database and establish a MongoDB connection
>>> db = MagnetismDatabase(db_name)
>>> # Run on the corpus
>>> db.from_files(corpus)
>>> # Read the resulting database
>>> entries = db.database.posts
>>> for record in entries.find():
...     print(record)
This will print a JSON-style document entry containing:
Record classes can be modified and extended to add new fields as required.
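Because the stored records are MongoDB documents, they may need a small conversion step before they can be written out as plain JSON: each document carries an `_id` value (an `ObjectId` by default) that the standard `json` module cannot encode. The sketch below shows one possible way to handle this. The helper name `records_to_json` and its `drop_id` parameter are hypothetical and not part of magdb.

```python
import json


def records_to_json(records, drop_id=True):
    """Serialise an iterable of MongoDB-style documents to a JSON string.

    Hypothetical helper for illustration; not part of magdb.
    """
    cleaned = []
    for record in records:
        doc = dict(record)  # shallow copy so the caller's document is untouched
        if drop_id:
            doc.pop("_id", None)  # remove MongoDB's internal identifier
        cleaned.append(doc)
    # default=str converts values that json cannot encode natively
    # (for example ObjectId or datetime) to their string form.
    return json.dumps(cleaned, indent=2, default=str)
```

Assuming `db` is the `MagnetismDatabase` instance from the example above, `records_to_json(db.database.posts.find())` would return all stored records as one JSON array.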