<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.2 20190208//EN" "http://jats.nlm.nih.gov/archiving/1.2/JATS-archivearticle1.dtd">
<article article-type="brief-report" xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>microPublication Biology</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2578-9430</issn>
      <publisher>
        <publisher-name>Caltech Library</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.17912/W25Q2N</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Software</subject>
        </subj-group>
        <subj-group subj-group-type="subject">
          <subject>Software</subject>
        </subj-group>
        <subj-group subj-group-type="species">
          <subject>C. elegans</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Two new functions in the WormBase Enrichment Suite</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Angeles-Albores</surname>
            <given-names>David</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="corresp" rid="cor1">§</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Lee</surname>
            <given-names>Raymond Y.N.</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Chan</surname>
            <given-names>Juancarlos</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Sternberg</surname>
            <given-names>Paul W.</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          Division of Biology and Biological Engineering, Caltech, Pasadena, CA, 91125, USA
        </aff>
      </contrib-group>
      <contrib-group>
        <contrib contrib-type="reviewer">
          <name>
            <surname>Roncaglia</surname>
            <given-names>Paola</given-names>
          </name>
        </contrib>
      </contrib-group>
      <author-notes>
        <corresp id="cor1">
          <label>§</label>
          Correspondence to: David Angeles-Albores (
          <email>dangeles@caltech.edu</email>
          )
        </corresp>
        <fn fn-type="con">
          <p>DA: </p>
          <p>RL: </p>
          <p>JC: </p>
          <p>PS: </p>
        </fn>
      </author-notes>
      <pub-date date-type="pub" publication-format="electronic">
        <day>27</day>
        <month>3</month>
        <year>2018</year>
      </pub-date>
      <pub-date date-type="collection" publication-format="electronic">
        <year>2018</year>
      </pub-date>
      <volume>2018</volume>
      <elocation-id>10.17912/W25Q2N</elocation-id>
      <history>
        <date date-type="received">
          <day>7</day>
          <month>2</month>
          <year>2018</year>
        </date>
        <date date-type="accepted">
          <day>28</day>
          <month>2</month>
          <year>2018</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>Copyright: © 2018 by the authors</copyright-statement>
        <copyright-year>2018</copyright-year>
        <license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/">
          <license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.</license-p>
        </license>
      </permissions>
    </article-meta>
  </front>
  <body>
    <fig position="anchor" id="f1">
      <label>Figure 1. </label>
      <caption>
        <p>
          Enrichment results for Tissue, Gene and Phenotype ontologies for genes overexpressed in a ciliary transcriptome (Wang 
          <italic>et al.</italic>
           2015).
        </p>
      </caption>
      <graphic xlink:href="25789430-2018-W25Q2N"/>
    </fig>
    <sec>
      <title>Description</title>
      <p>
        Genome-wide experiments routinely generate large amounts of data that can be hard to interpret biologically. A common approach to interpreting these results is enrichment analysis against controlled vocabularies, known as ontologies, that describe biological attributes such as the molecular or biological function of a gene. In 
        <italic>C. elegans</italic>
        , three distinct ontologies, the Gene Ontology (GO), Anatomy Ontology (AO), and the Worm Phenotype Ontology (WPO) are used to annotate gene function, expression and phenotype, respectively (Ashburner 
        <italic>et al</italic>
        . 2000; Lee and Sternberg, 2003; Schindelman 
        <italic>et al</italic>
        . 2011).
      </p>
      <p>
        Previously, we developed the Tissue Enrichment Analysis (TEA) tool, which tests datasets for enrichment of anatomical terms (Angeles-Albores <italic>et al</italic>. 2016). Using the same hypergeometric statistical method, we extend enrichment testing to the WPO and GO, offering a unified approach to enrichment testing in 
        <italic>C. elegans</italic>
        . The WormBase Enrichment Suite can be accessed via a user-friendly interface at 
        <ext-link ext-link-type="uri" xlink:href="http://www.wormbase.org/tools/enrichment/tea/tea.cgi">http://www.wormbase.org/tools/enrichment/tea/tea.cgi</ext-link>
        .
      </p>
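      <p>As a rough illustration of the hypergeometric method mentioned above, the following Python sketch computes a one-sided enrichment p-value for a single ontology term using only the standard library. The function name, its argument names and the example counts are assumptions made for illustration and are not taken from the Suite's source code; in particular, the choice of an upper tail that includes the observed count is an assumption.</p>
      <p>
        <code>
```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, observed):
    """Upper-tail hypergeometric probability P(X >= observed).

    population: number of genes in the background set
    annotated:  number of background genes annotated to the term
    sample:     number of genes in the user-supplied list
    observed:   number of genes in the list annotated to the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for every count from `observed` upwards.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to keep the arithmetic easy to check.
p = hypergeom_enrichment_pvalue(population=20, annotated=5, sample=4, observed=2)
print(round(p, 4))
```
        </code>
      </p>
      <p>With these hypothetical counts the p-value is 1205/4845, or about 0.25, so two annotated genes in a list of four would not be called enriched.</p>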
      <p>
        To validate the tools, we analyzed a previously published extracellular vesicle (EV)-releasing neuron (EVN) signature gene set derived from dissociated ciliated EV neurons (Wang 
        <italic>et al</italic>
        . 2015) using the WormBase Enrichment Suite based on the WS262 WormBase release. TEA correctly identified the CEM, hook sensillum and IL2 neuron as enriched tissues. The top phenotype associated with the EVN signature was chemosensory behavior. Gene Ontology enrichment analysis showed that cell projection and cell body were the most enriched cellular components in this gene set; the biological processes neuropeptide signaling pathway and vesicle localization were also enriched but ranked lower. The tutorial script used to generate Figure 1 can be viewed at:
        <ext-link ext-link-type="uri" xlink:href="https://github.com/dangeles/TissueEnrichmentAnalysis/blob/master/tutorial/Tutorial.ipynb">https://github.com/dangeles/TissueEnrichmentAnalysis/blob/master/tutorial/Tutorial.ipynb</ext-link>
      </p>
      <p>The addition of Gene Enrichment Analysis (GEA) and Phenotype Enrichment Analysis (PEA) to WormBase marks an important step towards a unified set of analyses that can help researchers to understand genomic datasets. These enrichment analyses will allow the community to fully benefit from the data curation ongoing at WormBase.</p>
    </sec>
    <sec>
      <title>Methods</title>
      <p>
        Using the methods described in Angeles-Albores 
        <italic>et al</italic>
        . (2016), we generated ontology dictionaries using the Anatomy, Phenotype and Gene Ontology annotations for 
        <italic>C. elegans</italic>
        . The dictionary similarity parameter was set to 95% for all ontologies. The minimum number of annotations per term was set to 33 for the AO, 50 for the WPO, and 33 for GO. Terms within each dictionary are tested using a hypergeometric probability test, and the resulting p-values are corrected for multiple testing using the Benjamini-Hochberg step-up algorithm. In WS262, there are 1320 anatomy terms, 1117 phenotype terms, and 3025 GO terms that have at least 11 genes annotated to them. The dictionaries are freely accessible using the Python version of the Suite, which can be installed using the pip tool for Python libraries:
      </p>
      <p>
        <code>pip install tissue_enrichment_analysis</code>
      </p>
      <p>The enrichment analysis library can then be imported in a Python script by writing:</p>
      <p>
        <code>import tissue_enrichment_analysis as ea</code>
      </p>
      <p>The dictionaries can then be downloaded by typing:</p>
      <p>
        <code>ea.fetch_dictionary(dict)</code>
      </p>
      <p>into Python, where `dict` is the string `tissue`, `phenotype` or `go`, specifying which dictionary to download. If the function does not receive an argument, the dictionary corresponding to the AO is downloaded by default. See the tutorial linked above for an example implementation.</p>
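      <p>The Benjamini-Hochberg step-up correction named in this section can be sketched in a few lines of standard-library Python. This is an illustrative version only: the function name and the example p-values are assumptions, and the Suite's own implementation may differ in detail.</p>
      <p>
        <code>
```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value never exceeds the adjusted value of any larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four ontology terms.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
print([round(q, 3) for q in qvals])
```
        </code>
      </p>
      <p>For these hypothetical inputs the adjusted values are approximately 0.02, 0.04, 0.04 and 0.02, in the same order as the input list; a term would then be reported as enriched if its adjusted value falls below the chosen significance threshold.</p>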
    </sec>
  </body>
  <back>
    <ack>
      <sec>
        <title>Funding</title>
        <p>This work was supported by the NIH grant U41 HG002223.</p>
      </sec>
    </ack>
    <ref-list>
      <ref id="R1">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Angeles-Albores</surname>
              <given-names>D</given-names>
            </name>
            <name>
              <surname>Lee</surname>
              <given-names>RYN</given-names>
            </name>
            <name>
              <surname>Chan</surname>
              <given-names>J</given-names>
            </name>
            <name>
              <surname>Sternberg</surname>
              <given-names>PW</given-names>
            </name>
          </person-group>
          <year>2016</year>
          <month>9</month>
          <day>13</day>
          <article-title>Tissue enrichment analysis for C. elegans genomics.</article-title>
          <source>BMC Bioinformatics</source>
          <volume>17</volume>
          <issue>1</issue>
          <issn/>
          <fpage>366</fpage>
          <lpage>366</lpage>
          <pub-id pub-id-type="doi">10.1186/s12859-016-1229-9</pub-id>
          <pub-id pub-id-type="pmid">27618863</pub-id>
        </element-citation>
      </ref>
      <ref id="R2">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Ashburner</surname>
              <given-names>M</given-names>
            </name>
            <name>
              <surname>Ball</surname>
              <given-names>CA</given-names>
            </name>
            <name>
              <surname>Blake</surname>
              <given-names>JA</given-names>
            </name>
            <name>
              <surname>Botstein</surname>
              <given-names>D</given-names>
            </name>
            <name>
              <surname>Butler</surname>
              <given-names>H</given-names>
            </name>
            <name>
              <surname>Cherry</surname>
              <given-names>JM</given-names>
            </name>
            <name>
              <surname>Davis</surname>
              <given-names>AP</given-names>
            </name>
            <name>
              <surname>Dolinski</surname>
              <given-names>K</given-names>
            </name>
            <name>
              <surname>Dwight</surname>
              <given-names>SS</given-names>
            </name>
            <name>
              <surname>Eppig</surname>
              <given-names>JT</given-names>
            </name>
            <name>
              <surname>Harris</surname>
              <given-names>MA</given-names>
            </name>
            <name>
              <surname>Hill</surname>
              <given-names>DP</given-names>
            </name>
            <name>
              <surname>Issel-Tarver</surname>
              <given-names>L</given-names>
            </name>
            <name>
              <surname>Kasarskis</surname>
              <given-names>A</given-names>
            </name>
            <name>
              <surname>Lewis</surname>
              <given-names>S</given-names>
            </name>
            <name>
              <surname>Matese</surname>
              <given-names>JC</given-names>
            </name>
            <name>
              <surname>Richardson</surname>
              <given-names>JE</given-names>
            </name>
            <name>
              <surname>Ringwald</surname>
              <given-names>M</given-names>
            </name>
            <name>
              <surname>Rubin</surname>
              <given-names>GM</given-names>
            </name>
            <name>
              <surname>Sherlock</surname>
              <given-names>G</given-names>
            </name>
          </person-group>
          <year>2000</year>
          <month>5</month>
          <day>1</day>
          <article-title>Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.</article-title>
          <source>Nat Genet</source>
          <volume>25</volume>
          <issue>1</issue>
          <issn>1061-4036</issn>
          <fpage>25</fpage>
          <lpage>29</lpage>
          <pub-id pub-id-type="doi">10.1038/75556</pub-id>
          <pub-id pub-id-type="pmid">10802651</pub-id>
        </element-citation>
      </ref>
      <ref id="R3">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Lee</surname>
              <given-names>RY</given-names>
            </name>
            <name>
              <surname>Sternberg</surname>
              <given-names>PW</given-names>
            </name>
          </person-group>
          <year>2003</year>
          <article-title>Building a cell and anatomy ontology of Caenorhabditis elegans.</article-title>
          <source>Comp Funct Genomics</source>
          <volume>4</volume>
          <issue>1</issue>
          <issn>1531-6912</issn>
          <fpage>121</fpage>
          <lpage>126</lpage>
          <pub-id pub-id-type="doi">10.1002/cfg.248</pub-id>
          <pub-id pub-id-type="pmid">18629098</pub-id>
        </element-citation>
      </ref>
      <ref id="R4">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Schindelman</surname>
              <given-names>G</given-names>
            </name>
            <name>
              <surname>Fernandes</surname>
              <given-names>JS</given-names>
            </name>
            <name>
              <surname>Bastiani</surname>
              <given-names>CA</given-names>
            </name>
            <name>
              <surname>Yook</surname>
              <given-names>K</given-names>
            </name>
            <name>
              <surname>Sternberg</surname>
              <given-names>PW</given-names>
            </name>
          </person-group>
          <year>2011</year>
          <month>1</month>
          <day>24</day>
          <article-title>Worm Phenotype Ontology: integrating phenotype data within and beyond the C. elegans community.</article-title>
          <source>BMC Bioinformatics</source>
          <volume>12</volume>
          <issue/>
          <issn/>
          <fpage>32</fpage>
          <lpage>32</lpage>
          <pub-id pub-id-type="doi">10.1186/1471-2105-12-32</pub-id>
          <pub-id pub-id-type="pmid">21261995</pub-id>
        </element-citation>
      </ref>
      <ref id="R5">
        <element-citation publication-type="journal">
          <person-group person-group-type="author">
            <name>
              <surname>Wang</surname>
              <given-names>J</given-names>
            </name>
            <name>
              <surname>Kaletsky</surname>
              <given-names>R</given-names>
            </name>
            <name>
              <surname>Silva</surname>
              <given-names>M</given-names>
            </name>
            <name>
              <surname>Williams</surname>
              <given-names>A</given-names>
            </name>
            <name>
              <surname>Haas</surname>
              <given-names>LA</given-names>
            </name>
            <name>
              <surname>Androwski</surname>
              <given-names>RJ</given-names>
            </name>
            <name>
              <surname>Landis</surname>
              <given-names>JN</given-names>
            </name>
            <name>
              <surname>Patrick</surname>
              <given-names>C</given-names>
            </name>
            <name>
              <surname>Rashid</surname>
              <given-names>A</given-names>
            </name>
            <name>
              <surname>Santiago-Martinez</surname>
              <given-names>D</given-names>
            </name>
            <name>
              <surname>Gravato-Nobre</surname>
              <given-names>M</given-names>
            </name>
            <name>
              <surname>Hodgkin</surname>
              <given-names>J</given-names>
            </name>
            <name>
              <surname>Hall</surname>
              <given-names>DH</given-names>
            </name>
            <name>
              <surname>Murphy</surname>
              <given-names>CT</given-names>
            </name>
            <name>
              <surname>Barr</surname>
              <given-names>MM</given-names>
            </name>
          </person-group>
          <year>2015</year>
          <month>12</month>
          <day>10</day>
          <article-title>Cell-Specific Transcriptional Profiling of Ciliated Sensory Neurons Reveals Regulators of Behavior and Extracellular Vesicle Biogenesis.</article-title>
          <source>Curr Biol</source>
          <volume>25</volume>
          <issue>24</issue>
          <issn>0960-9822</issn>
          <fpage>3232</fpage>
          <lpage>3238</lpage>
          <pub-id pub-id-type="doi">10.1016/j.cub.2015.10.057</pub-id>
          <pub-id pub-id-type="pmid">26687621</pub-id>
        </element-citation>
      </ref>
    </ref-list>
  </back>
</article>