Extending ngless and interacting with other projects [4/5]

NOTE
: As of Apr 2016, ngless is available
only as a pre-release to allow for testing of early versions by fellow scientists and to discusss ideas. We do not consider it /released/ and do not recommend use in production as some functionality is still in testing. Please get in touch
if you are interested in using ngless in your projects.

This is the first of a series of five posts introducing ngless.

  1. Introduction to ngless
  2. Perfect reproducibility using ngless
  3. Fast and high quality error detection
  4. Extending and interacting with other projects [this post]
  5. Miscellaneous

Extending and interacting with other projects

A frequently asked question
about ngless is whether the language is extensible. Yes, it is
. You can add modules using a simple text-only format ( YaML
). These modules can then add new functions to ngless. Behind the scenes, this results in command line calls to scripts you write.

For example, to integrate motus
into ngless, I used a simple configuration file
, which I am going to describe it here.

Every module has a name and a version:

name: 'motus'
version: '0.0.0'

You can add a citation text. This will be shown to all users of your module (citing the software you use is a best practice, so we support it):

citation: "Metagenomic species profiling using universal phylogenetic marker genes"

You can add an init
command. This will run before anything else runs at the start of the interpretation. It should be quick and check that things are OK. For example, in this case, we check that Python is installed. Thus, if there is a problem, the user gets a fast error message before anything else is run.

init:
    init_cmd: './check-python.sh'

Now, we list the functions we are implementing:

function:

In this case, there is just one, corresponding to the ngless function motus
.

    nglName: "motus"

arg0
is the command to run (which implements this function):

    arg0: './run-python.sh'

In ngless functions have a single unnamed argument and any number of named arguments. So, we specify first arg1 which is a special

    arg1:
        filetype: "tsv"
        can_gzip: true

The can_gzip
flag lets ngless know that it is OK to pass a compressed file to your script. Now, we list any additional arguments. In this case, there is a required argument:

    additional:
        -
            atype: 'str'
            name: 'ofile'
            def: ''
            required: true

The argument is a string, without a default. That’s it. Now, we can use the motus
function in a ngless script:

ngless "0.0"
import "motus" version "0.0.0"

input = paired('data/reads.1.fq.gz', 'data/reads.2.fq.gz')
preprocess(input, keep_singles=False) using |read|:
    read = substrim(read, min_quality=25)
    if len(read) < 45:
        discard

mapped = map(input, ref='motus')
mapped = select(mapped) using |mread|:
    mread = mread.filter(min_identity_pc=97)
counted = count(mapped, gff_file='motus.gtf.gz', features=['gene'], multiple={dist1})
motus(counted, ofile='motus-counts.txt')

What can modules do?

An external module can

  • add new functions (will result in a call to a script, which will often be a wrapper around some tool).
  • add new reference information (new catalogs, &c). This can even be downloaded on demand (currently [Apr 2016], the module init script must do this itself; in the future, ngless will support just a URL).
  • add a citation so that all users of the module will see the citation message. This ensures that if you develop a package which gets wrapped into an ngless module, those final users will still see your citation.
稿源:Meta Rabbit (源链) | 关于 | 阅读提示

本站遵循[CC BY-NC-SA 4.0]。如您有版权、意见投诉等问题,请通过eMail联系我们处理。
酷辣虫 » 综合编程 » Extending ngless and interacting with other projects [4/5]

喜欢 (0)or分享给?

专业 x 专注 x 聚合 x 分享 CC BY-NC-SA 4.0

使用声明 | 英豪名录