mmcif_pdbx
This is yet another PyPI package for PDBx/mmCIF (http://mmcif.wwpdb.org/pdbx-mmcif-home-page.html). It emphasizes a simple and pure Python interface to basic mmCIF functionality.
The canonical mmCIF Python package can be found at https://github.com/rcsb/py-mmcif. It is full-featured and includes C/C++ code to accelerate I/O functions.
This package provides the module pdbx. More information about the pdbx module can be found in the API reference section.
Origin of this software
All of the code in this repository is based on http://mmcif.wwpdb.org/. Specifically, this code is directly derived from http://mmcif.wwpdb.org/docs/sw-examples/python/src/pdbx.tar.gz linked from http://mmcif.wwpdb.org/docs/sw-examples/python/html/.
See http://mmcif.wwpdb.org/docs/sw-examples/python/html/ for more information about this package, including examples.
Versions
Versions 0.* maintain API compatibility with the original code. Subsequent versions break that compatibility, primarily by renaming methods to comply with PEP 8.
Installation
This Python package can be installed via setuptools, pip install ., or via PyPI.
Testing
The software can be tested with pytest by running:
python -m pytest
from the top-level directory.
API reference
Note
The API is still changing. We use semantic versioning and our Change log to document changes between versions.
Basic functions
PDBx/mmCIF Python dictionary resources.
All of the code in this repository is based on http://mmcif.wwpdb.org/. Specifically, this code is directly derived from the pdbx code linked from PDBx Python Parser Examples and Tutorial.
See PDBx Python Parser Examples and Tutorial for more information about this package, including examples.

pdbx.dump(datacontainers, fp)
Write a list of objects to a CIF file.
Parameters:
- datacontainers (list) – a list of DataContainer objects
- fp (file) – a file object ready for writing

pdbx.dumps(datacontainers) → str
Serialize a list of objects to a CIF-formatted string.
Parameters: datacontainers (list) – list of DataContainer objects
Returns: CIF-formatted string

pdbx.load(fp) → list
Parse a CIF file.
Parameters: fp (file) – file object ready for reading
Returns: a list of DataContainer objects

pdbx.loads(s) → list
Parse a CIF string.
Parameters: s (str) – string with CIF data
Returns: a list of DataContainer objects
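
As a rough illustration of how these functions fit together, the sketch below parses a CIF string with pdbx.loads and serializes the result back with pdbx.dumps. It assumes the mmcif_pdbx package is installed. The sample data block, the constant SAMPLE_CIF, and the helper round_trip are made up for this example and are not part of the package.

```python
# Hypothetical sample data: one data block with two items in category "cell".
SAMPLE_CIF = """data_demo
#
_cell.length_a 10.0
_cell.length_b 20.0
"""


def round_trip(cif_text):
    """Parse CIF text into containers, then serialize them back to text."""
    # Imported lazily so this sketch can be loaded without the package;
    # it assumes mmcif_pdbx is installed when the function is called.
    import pdbx

    containers = pdbx.loads(cif_text)  # list of DataContainer objects
    return pdbx.dumps(containers)      # CIF-formatted string
```

Calling round_trip(SAMPLE_CIF) should return CIF text describing the same data block, although whitespace and layout may differ from the input. pdbx.load and pdbx.dump follow the same pattern with file objects in place of strings.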
Input-output classes
reader
PDBx/mmCIF dictionary and data file parser.
Note
Acknowledgements:
The tokenizer used in this module is modeled after the clever parser design used in the PyMMLib package.
PyMMLib Development Group:
Authors: Ethan Merritt: merritt@u.washington.edu, Jay Painter: jay.painter@gmail.com

class pdbx.reader.PdbxReader(input_file)
PDBx reader for data files and dictionaries.

PdbxReader.read(container_list)
Appends to the input list of definition and data containers.
Parameters: container_list (list) – list of ContainerBase containers to append to.
writer
Classes for writing data and dictionary containers in PDBx/mmCIF format.

class pdbx.writer.PdbxWriter(output_file=sys.stdout)
Write PDBx data files or dictionaries. Use the input container or container list.

PdbxWriter.set_row_partition(num_rows)
Maximum number of rows checked for value length and format.
Parameters: num_rows (int) – maximum number of rows

PdbxWriter.write(container_list)
Write out a list of containers.
Parameters: container_list (list) – list of ContainerBase objects to write.

PdbxWriter.write_container(container)
Write out information for an individual container.
Parameters: container (ContainerBase) – container to write
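
The reader and writer classes can be used directly when more control is wanted than the basic functions give. The sketch below builds one container by hand, writes it to an in-memory buffer with PdbxWriter, and reads it back with PdbxReader. It assumes the mmcif_pdbx package is installed and that both classes accept any text file object, such as io.StringIO. The helper name write_then_read and the sample values are hypothetical.

```python
import io


def write_then_read():
    """Build one container, write it as CIF text, and read it back."""
    # Lazy imports: assumes mmcif_pdbx is installed when this is called.
    from pdbx.containers import DataCategory, DataContainer
    from pdbx.reader import PdbxReader
    from pdbx.writer import PdbxWriter

    # One category ("cell") with two attributes and a single row.
    category = DataCategory("cell", ["length_a", "length_b"], [["10.0", "20.0"]])
    container = DataContainer("demo")
    container.append(category)

    # Write into an in-memory text buffer instead of the default stdout.
    buffer = io.StringIO()
    PdbxWriter(buffer).write([container])
    cif_text = buffer.getvalue()

    # read() appends the parsed containers to the list passed in.
    containers = []
    PdbxReader(io.StringIO(cif_text)).read(containers)
    return cif_text, containers
```

Note that read() does not return the containers. It fills the list passed in, which makes it possible to accumulate containers from several files in one list.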
Data structure classes
containers
A collection of container classes supporting the PDBx/mmCIF storage model.
A base container class is defined which supports common features of data and definition containers. PDBx data files are organized in sections called data blocks which are mapped to data containers. PDBx dictionaries contain definition sections and data sections which are mapped to definition and data containers respectively.
Data in both PDBx data files and dictionaries are organized in data categories. In the PDBx syntax, individual items of data are identified by labels of the form ‘_categoryName.attribute_name’. The terms category and attribute in PDBx jargon are analogous to table and column in a relational data model, or class and attribute in an object-oriented data model.
The DataCategory class provides the base storage container for instance data and definition metadata.
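
The table analogy can be made concrete with a few lines of plain Python. The sketch below does not use this package: split_item_name is a hypothetical helper that splits an item label into its category and attribute parts, and the sample item names and rows are made up for illustration.

```python
def split_item_name(item_name):
    """Split '_category.attribute' into (category, attribute)."""
    if not item_name.startswith("_") or "." not in item_name:
        raise ValueError(f"not a PDBx item name: {item_name!r}")
    category, attribute = item_name[1:].split(".", 1)
    return category, attribute


# Category ~ table, attribute ~ column, each row ~ one record.
item_names = ["_atom_site.id", "_atom_site.type_symbol"]
rows = [["1", "N"], ["2", "C"]]

# Column names come from the attribute part of each item name.
columns = [split_item_name(name)[1] for name in item_names]
records = [dict(zip(columns, row)) for row in rows]
```

Here both items belong to the category atom_site, so they behave like two columns of a single table, and each entry of rows is one record of that table.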

class pdbx.containers.CifName
Class of utilities for CIF-style data names.

class pdbx.containers.ContainerBase(name)
Container base class for data and definition objects.

ContainerBase.append(obj)
Add the input object to the current object catalog. An existing object of the same name will be overwritten.
Parameters: obj (DataCategory) – input object to catalog

ContainerBase.exists(name) → bool
Determine if object name exists in object catalog.
Parameters: name (str) – object name
Returns: whether object exists in object catalog

ContainerBase.get_object(name)
Get object from object catalog.
Parameters: name (str) – object name
Returns: object or None
Return type: DataCategory

ContainerBase.get_object_name_list() → list
Get list of object names.
Returns: list of the names of the DataCategory objects in the catalog

ContainerBase.name
Get container name.
Returns: container name

ContainerBase.print_it(fh=sys.stdout, type_='brief')
Dump information about container to specified file object.
Parameters:
- fh (file) – file object for writing
- type_ (str) – type of summary (“brief” makes it short)

ContainerBase.remove(current_name) → bool
Remove object by name.
Parameters: current_name (str) – name of object to remove
Returns: True on success or False otherwise.

ContainerBase.rename(current_name, new_name) → bool
Change the name of an object in place.
Parameters:
- current_name (str) – old name for object
- new_name (str) – new name for object
Returns: indicator of whether renaming was successful

ContainerBase.replace(obj)
Replace an existing object with the input object.
Parameters: obj (DataCategory) – input object to catalog

class pdbx.containers.DataCategory(name, attribute_name_list=None, row_list=None)
Methods for creating, accessing, and formatting PDBx CIF data categories.

DataCategory.append_attribute(attribute_name)
Add attribute to container.
Parameters: attribute_name (str) – name of attribute to add

DataCategory.append_attribute_extend_rows(attribute_name)
Append attribute and extend rows.
Parameters: attribute_name (str) – name of attribute to add

DataCategory.attribute_count
Get number of attributes.

DataCategory.attribute_list
Get list of attributes.

DataCategory.attribute_list_with_order
Get list of attributes in order.

DataCategory.current_attribute
Get current attribute.

DataCategory.current_row_index
Get current row index.

DataCategory.dump_it(file_=sys.stdout)
Dump contents of container.
Parameters: file_ (file) – file object ready for writing

DataCategory.get_attribute_index(attribute_name) → int
Get index of given attribute.
Parameters: attribute_name (str) – name of attribute
Returns: index of attribute
Raises: IndexError – if attribute not found

DataCategory.get_format_type_list(steps=1) → str
Get a formatted type list.
Parameters: steps (int) – step size for iterating through rows
Returns: formatted type list

DataCategory.get_format_type_list_x
Alternate version of format type list.

DataCategory.get_full_row(index) → list
Return a full row based on the length of the attribute list.
Parameters: index (int) – index of row to retrieve
Returns: row

DataCategory.get_max_attribute_list_length(steps=1) → int
Get maximum length of attribute value list.
Parameters: steps (int) – step size for iterating through rows
Returns: attribute value list max length

DataCategory.get_row(index) → list
Get specified row.
Parameters: index (int) – row index
Returns: specified row or empty array if row not found.

DataCategory.get_value(attribute_name=None, row_index=None)
Get value for specified attribute and row.
Parameters:
- attribute_name (str) – attribute name
- row_index (int) – row index
Returns: attribute value
Raises: IndexError – if attribute not found

DataCategory.get_value_formatted(attribute_name=None, row_index=None) → str
Get formatted version of value.
Parameters:
- attribute_name (str) – attribute name
- row_index (int) – row index
Returns: formatted value

DataCategory.get_value_formatted_by_index(attribute_index, row_index) → str
Get value formatted by index.
Parameters:
- attribute_index (int) – attribute index
- row_index (int) – row index
Returns: formatted value

DataCategory.invoke_attribute_method(attribute_name, method)
Invoke method of current attribute.
Parameters:
- attribute_name (str) – attribute name
- method (str) – name of attribute method

DataCategory.invoke_category_method(method)
Invoke method of current category.
Parameters: method (str) – name of method

DataCategory.item_name_list
List of attribute names as fully qualified item names.

DataCategory.max_attribute_list_length
Get maximum attribute list length.

DataCategory.name
Get container name.

DataCategory.print_it(file_=sys.stdout)
Print container information.
Parameters: file_ (file) – file object ready for writing

DataCategory.remove_row(index) → bool
Remove specified row.
Parameters: index (int) – index of row to remove
Returns: True if successful, False otherwise

DataCategory.rename_attribute(current_attribute_name, new_attribute_name) → bool
Change the name of an attribute in place.
Parameters:
- current_attribute_name (str) – current attribute name
- new_attribute_name (str) – new attribute name
Returns: flag indicating renaming success

DataCategory.replace_substring(old_value, new_value, attribute_name) → bool
Replace substring of value of given attribute.
Parameters:
- old_value – old attribute value
- new_value – new attribute value
- attribute_name (str) – name of attribute to replace
Returns: Boolean flag indicating success.

DataCategory.replace_value(old_value, new_value, attribute_name) → int
Replace the value of the specified attribute.
Parameters:
- old_value – old attribute value
- new_value – new attribute value
- attribute_name (str) – name of attribute to replace
Returns: number of replacements

DataCategory.row_count
Get number of rows.

DataCategory.row_list
Get list of rows.
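
Because a category holds an attribute list and a list of rows, it converts naturally into a list of records. The sketch below is one possible way to do that. It assumes that attribute_list and row_list are readable as plain attributes (they are listed above without call signatures) and that values in each row are in the same order as attribute_list. The helper name category_to_dicts is hypothetical and not part of the package.

```python
def category_to_dicts(category):
    """Convert a DataCategory-like object into one dict per row.

    Assumes `category.attribute_list` is a sequence of attribute names and
    `category.row_list` is a sequence of rows aligned with those names.
    """
    names = list(category.attribute_list)
    return [dict(zip(names, row)) for row in category.row_list]
```

The function only relies on those two attributes, so it can be tried out with any stand-in object that provides them before being pointed at real parsed data.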
class
pdbx.containers.
DataCategoryBase
(name, attribute_name_list=None, row_list=None)[source]¶ Base object definition for a data category.
-
get
() → tuple[source]¶ Get name, attribute name list, and row list.
Returns: tuple of (name, attribute name list, and row list)
-
-
class
pdbx.containers.
DataContainer
(name)[source]¶ Container class for DataCategory objects.
-
class
pdbx.containers.
DefinitionContainer
(name)[source]¶ Container for definitions.
-
is_attribute
() → bool[source]¶ Determine if container contains item objects.
Returns: indicator of whether item objects are in container
-
Change log
v1.1.2 (01-Aug-2020)
v1.1.0 (11-Jul-2020)
v1.0.0 (07-Jul-2020)
v0.0.1 (05-Jul-2020)
Initial release.