Pineal gland; day and night; and mixed tissue, day; polyA (Rat)
Specimens
Pools of 6 rat pineal glands were collected at mid-day (ZT7) and mid-night (ZT19) from animals housed in a 14:10 light:dark lighting cycle. In addition, a mixed tissue sample was prepared from material collected from three rats at ZT7. The tissues were: cortex, cerebellum, midbrain, hypothalamus, hindbrain, spinal cord, retina, pituitary, heart, liver, lung, kidney, skeletal muscle, small intestine, adrenal gland.
RNA Preparation
Samples of tissue from the three animals were mixed. Total RNA was extracted using an RNeasy Kit with on-column DNase treatment. To prepare the mixed tissue sample, equal amounts of RNA from each of the 15 tissues were combined.
RNA-Seq library preparation and sequencing
Total RNA (10 µg) from the pineal gland pools and the mixed tissue sample was polyA-selected, fragmented, and sequenced on an Illumina GAII machine, yielding paired-end 51-mer reads.
Bioinformatics methods
See bioinformatics methods for details.
lncRNA
To identify lncRNAs, the original study aligned reads to the rat rn4 assembly and used a windowing method, along with manual curation and thresholding, to detect candidate lncRNAs. An alternative approach is presented here, aligning to the rn6 assembly and taking advantage of recent advances in software tools.
StringTie was used to generate a GFF file for each sample (rat-polyA day, night, and mixed) from the aligned BAM files. Transcripts identified in each sample were combined and then merged with the reference annotation for rn6. In addition, the original lncRNA coordinates, which were defined on rn4, were lifted over to rn5 and then to rn6. For the liftover, the minimum ratio of bases that must remap was lowered to 0.5 so that all lncRNAs were retained. StringTie transcripts that overlapped a known gene were filtered out with bedtools intersect -v.
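The filtering step can be illustrated with a minimal Python sketch of what bedtools intersect -v does conceptually: keep only features that share no base with any interval in a second set. This is not the pipeline's code. The function name drop_overlapping and the example coordinates are hypothetical, intervals are assumed to be BED-style (0-based, half-open), and strand is ignored.

```python
from collections import defaultdict


def drop_overlapping(features, genes):
    """Return the features that share no base with any gene.

    Both inputs are lists of (chrom, start, end) tuples using BED-style,
    half-open coordinates, so book-ended intervals do not overlap.
    """
    genes_by_chrom = defaultdict(list)
    for chrom, start, end in genes:
        genes_by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in features:
        # Two half-open intervals overlap when each starts before the other ends.
        overlaps = any(
            start < g_end and g_start < end
            for g_start, g_end in genes_by_chrom[chrom]
        )
        if not overlaps:
            kept.append((chrom, start, end))
    return kept


# Hypothetical coordinates, for illustration only.
transcripts = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
known_genes = [("chr1", 150, 300), ("chr2", 200, 400)]
novel = drop_overlapping(transcripts, known_genes)
```

The linear scan over genes is chosen for readability; on genome-scale annotations, a sorted sweep or an interval tree (which is what dedicated tools use) would be the practical choice.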
On these data, StringTie reports many fragmented regions that should be merged into larger lncRNAs to be consistent with the manual curation from the original lncRNA calls. To determine appropriate parameters for merging, bedtools intersect was used to select StringTie features overlapping the originally defined, lifted-over lncRNA coordinates. Based on the histogram of inter-feature distances within this subset, 25% of distances were within 863 bp, 50% within 3688 bp, and 75% within 14219 bp.
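As a rough illustration of how such distance quartiles can be derived, the sketch below computes gaps between consecutive features and summarizes them with the standard library. It is not the original analysis. The function name inter_feature_gaps and the coordinates are hypothetical, and features are grouped by chromosome as a simplification, whereas the description above measures distances within each lifted-over lncRNA.

```python
from itertools import groupby
from statistics import quantiles


def inter_feature_gaps(features):
    """Return gaps (in bp) between consecutive features in each group.

    `features` is a list of (group, start, end) tuples; the group is a
    chromosome here, but could equally be an lncRNA identifier.
    Overlapping neighbours contribute a gap of 0.
    """
    gaps = []
    ordered = sorted(features)
    for _, members in groupby(ordered, key=lambda f: f[0]):
        members = list(members)
        for (_, _, prev_end), (_, next_start, _) in zip(members, members[1:]):
            gaps.append(max(0, next_start - prev_end))
    return gaps


# Hypothetical coordinates, for illustration only.
features = [
    ("chr1", 0, 100), ("chr1", 200, 300), ("chr1", 500, 600),
    ("chr1", 900, 1000), ("chr1", 1400, 1500), ("chr1", 2000, 2100),
    ("chr2", 50, 60),  # a lone feature contributes no gap
]
gaps = inter_feature_gaps(features)
q25, q50, q75 = quantiles(gaps, n=4, method="inclusive")
```

Candidate merge distances can then be chosen to bracket the resulting quartiles.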
The unmerged StringTie transcripts were sorted and filtered to remove any that overlapped a known gene, and then merged using merge distances of 2.5 kb, 5 kb, 10 kb, and 15 kb, consistent with the intra-lncRNA inter-feature distances described above. The track hub linked above shows the 10 kb merge version.
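The effect of the merge distance can be sketched in Python as follows. This mirrors the kind of distance-based merging that a tool such as bedtools merge with its -d option performs, but the tool used for this step is not stated above, so treat the code as an assumption-laden illustration. The function name merge_within and the coordinates are hypothetical.

```python
def merge_within(features, max_gap):
    """Merge features on the same chromosome separated by at most max_gap bp.

    `features` is a list of (chrom, start, end) tuples in BED-style
    coordinates. Overlapping and book-ended features are always merged.
    """
    merged = []
    for chrom, start, end in sorted(features):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Extend the previous interval rather than starting a new one.
            _, prev_start, prev_end = merged[-1]
            merged[-1] = (chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical fragments, for illustration only.
fragments = [
    ("chr1", 1000, 2000), ("chr1", 6000, 7000),
    ("chr1", 20000, 21000), ("chr2", 500, 900),
]
# The four merge distances listed above, in bp.
by_distance = {d: merge_within(fragments, d) for d in (2500, 5000, 10000, 15000)}
```

Larger merge distances join more fragments into a single candidate lncRNA, at the cost of possibly fusing neighbouring but unrelated transcripts; comparing several distances makes that trade-off visible.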
References
(1) Coon SL, Munson PJ, Cherukuri PF, Sugden D, Rath MF, Møller M, Clokie SJ, Fu C, Olanich ME, Rangel Z, Werner T; NISC Comparative Sequencing Program, Mullikin JC, Klein DC. Circadian changes in long noncoding RNAs in the pineal gland. Proc Natl Acad Sci U S A. 2012 Aug 14;109(33):13319-24. doi: 10.1073/pnas.1207748109. Epub 2012 Aug 3.
Note: In some cases, the mixed tissue values for selected genes are exceptionally high because that gene is very highly expressed in only one tissue, e.g., Rho and GH are strongly expressed only in the retina and pituitary gland, respectively.