added reverse complement func
lmdu committed Sep 6, 2023
1 parent e7f7b91 commit 575802e
Showing 5 changed files with 61 additions and 14 deletions.
22 changes: 19 additions & 3 deletions docs/source/api_reference.rst
@@ -26,15 +26,29 @@ pyfastx.version

:rtype: bool

.. py:function:: pyfastx.reverse_complement(seq)
New in pyfastx 2.0.0

Get the reverse complement of the given DNA sequence

:param str seq: DNA sequence

:return: reverse complement sequence

:rtype: str
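
As a rough illustration of what a reverse complement computes, the pure-Python sketch below complements each base and then reverses the string. It is not the pyfastx implementation (which is written in C). The helper name ``reverse_complement_sketch`` and the IUPAC complement table are assumptions made for this example only; the set of characters pyfastx itself accepts may differ.

```python
# Illustrative sketch only; not pyfastx's own code.
# Assumed complement table covering A/C/G/T plus IUPAC ambiguity codes.
_BASES = "ACGTRYSWKMBDHVN"
_COMPLEMENTS = "TGCAYRSWMKVHDBN"

# Build one translation table for both uppercase and lowercase input.
_TABLE = str.maketrans(
    _BASES + _BASES.lower(),
    _COMPLEMENTS + _COMPLEMENTS.lower(),
)


def reverse_complement_sketch(seq: str) -> str:
    """Complement every base, then reverse the result.

    Characters missing from the table are left unchanged.
    """
    return seq.translate(_TABLE)[::-1]


if __name__ == "__main__":
    print(reverse_complement_sketch("ATGC"))
```

Applying the function twice returns the original sequence, which makes a convenient sanity check for any complement table.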

pyfastx.Fasta
-------------

.. py:class:: pyfastx.Fasta(file_name, uppercase=True, build_index=True, full_index=False, full_name=False, memory_index=False, key_func=None)
.. py:class:: pyfastx.Fasta(file_name, index_file=None, uppercase=True, build_index=True, full_index=False, full_name=False, memory_index=False, key_func=None)
Read and parse fasta files. Fasta can be used as dict or list, you can use index or sequence name to get a sequence object, e.g. ``fasta[0]``, ``fasta['seq1']``

:param str file_name: the file path of input FASTA file

:param str index_file: path to the index file of the FASTA file; by default, the index file with the .fxi extension in the same directory as the FASTA file is used. New in 2.0.0

:param bool uppercase: always output uppercase sequence, default: ``True``

:param bool build_index: build index for random access to FASTA sequence, default: ``True``. If build_index is False, iteration will return a tuple (name, seq); If build_index is True, iteration will return a sequence object.
@@ -264,11 +278,13 @@ pyfastx.Fastq

New in ``pyfastx`` 0.4.0

.. py:class:: pyfastx.Fastq(file_name, phred=0, build_index=True, full_index=False)
.. py:class:: pyfastx.Fastq(file_name, index_file=None, phred=0, build_index=True, full_index=False)
Read and parse fastq file

:param str file_name: input fastq file path
:param str file_name: input FASTQ file path

:param str index_file: path to the index file of the FASTQ file; by default, the index file with the .fxi extension in the same directory as the FASTQ file is used. New in 2.0.0

:param bool build_index: build index for random access to FASTQ reads, default: ``True``. If build_index is False, iteration will return a tuple (name, seq, qual); If build_index is True, iteration will return a read object
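
The ``index_file`` default described above can be pictured with the small sketch below. It assumes that the default index name is the input file name with ``.fxi`` appended, in the same directory; the helper ``resolve_index_path`` is hypothetical and is not part of pyfastx.

```python
from pathlib import Path
from typing import Optional


def resolve_index_path(seq_file: str, index_file: Optional[str] = None) -> Path:
    """Return the index path that would be used for ``seq_file``.

    Hypothetical helper: an explicit ``index_file`` wins; otherwise the
    assumed default is ``<seq_file>.fxi`` next to the sequence file.
    """
    if index_file is not None:
        return Path(index_file)
    seq_path = Path(seq_file)
    # Append ".fxi" to the full file name, keeping extensions such as ".fq.gz".
    return seq_path.with_name(seq_path.name + ".fxi")


if __name__ == "__main__":
    print(resolve_index_path("data/reads.fq.gz"))
    print(resolve_index_path("data/reads.fq.gz", "/tmp/reads.fxi"))
```

Specifying the index path explicitly is useful when the directory holding the sequence file is read-only, since the index can then be written elsewhere.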

31 changes: 22 additions & 9 deletions docs/source/changelog.rst
@@ -1,6 +1,19 @@
Changelog
=========

Version 2.0.0 (2023-09-05)
--------------------------

- Added support for file names with wide characters
- Added support for specifying index file path
- Added support for more characters in DNA sequence
- Added reverse complement function for DNA conversion
- Improved the performance of kseq library
- Optimized gzip index importing and saving without temp file
- Fixed segmentation fault when using sequence composition
- Fixed memory leak in Fastq read quality integer
- Fixed broken zlib download URL error when building

Version 1.1.0 (2023-04-19)
--------------------------

@@ -33,35 +46,38 @@ Version 0.9.0 (2022-12-30)
- Fixed the quality score parsing error from fastq
- Fixed the reference of sequence returned from function

Older versions
--------------

Version 0.8.4 (2021-06-30)
--------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Added slice feature to FastaKeys
- Fixed FastaKeys and FastqKeys iteration memory leak
- Optimized FastaKeys and FastqKeys creation

Version 0.8.3 (2021-04-25)
--------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Fixed Fastx iteration for next function
- Fixed Fastx uppercase for reading fasta

Version 0.8.2 (2021-01-02)
--------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Fixed sample segfault error caused by fastq iteration error
- Fixed gzip index import error in multiple processes
- Fixed fastq iteration segfault error with full_name=True
- Fixed all objects iteration to support built-in next function

Version 0.8.1 (2020-12-16)
--------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Fixed pip install error from source code
- Removed support for python39 32bit due to dll load error

Version 0.8.0 (2020-12-15)
--------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Added Fastx object as a simple sequence iterator
- Added FastqKeys object to obtain read names
@@ -72,7 +88,7 @@ Version 0.8.0 (2020-12-15)
- Changed Identifier object to FastaKeys object

Version 0.7.0 (2020-09-20)
--------------------------
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Added support for extracting flank sequences
- Added support for indexing super large gzip file
@@ -81,9 +97,6 @@ Version 0.7.0 (2020-09-20)
- Fixed sequence dealloc error causing no fasta dealloc trigger
- Fixed fastq max and min quality score return value

Older versions
--------------

Version 0.6.17 (2020-08-31)
^^^^^^^^^^^^^^^^^^^^^^^^^^^

2 changes: 1 addition & 1 deletion src/index.c
@@ -584,7 +584,7 @@ PyObject *pyfastx_index_get_seq_by_id(pyfastx_Index *self, Py_ssize_t chrom){
);

return (PyObject *)obj;
} else {
} else {
PYFASTX_SQLITE_CALL(sqlite3_reset(self->uid_stmt));
PyErr_SetString(PyExc_IndexError, "Index Error");
return NULL;
18 changes: 18 additions & 0 deletions src/module.c
@@ -41,9 +41,27 @@ PyObject *pyfastx_gzip_check(PyObject *self, PyObject *args) {
Py_RETURN_FALSE;
}

PyObject *pyfastx_reverse_complement(PyObject *self, PyObject *args) {
const char *s;

PyObject *seq_obj;
PyObject *rc_obj;

if (!PyArg_ParseTuple(args, "O", &seq_obj)) {
return NULL;
}

s = PyUnicode_AsUTF8(seq_obj);
rc_obj = PyUnicode_FromString(s);
s = PyUnicode_AsUTF8(rc_obj);
reverse_complement_seq(s);
return rc_obj;
}

static PyMethodDef module_methods[] = {
{"version", (PyCFunction)pyfastx_version, METH_VARARGS | METH_KEYWORDS, NULL},
{"gzip_check", (PyCFunction)pyfastx_gzip_check, METH_VARARGS, NULL},
{"reverse_complement", (PyCFunction)pyfastx_reverse_complement, METH_VARARGS, NULL},
{NULL, NULL, 0, NULL}
};

2 changes: 1 addition & 1 deletion src/version.h
@@ -1,2 +1,2 @@
#define PYFASTX_VERSION "1.2.0"
#define PYFASTX_VERSION "2.0.0"
#define ZRAN_VERSION "1.7.0"
