• No results found

Additional File 1: Containing Figures S1 thru S7

N/A
N/A
Protected

Academic year: 2020

Share "Additional File 1: Containing Figures S1 thru S7"

Copied!
7
0
0

Loading.... (view fulltext now)

Full text

(1)

Additional File 1: Containing Figures S1 thru S7

Figure S1: A) Pairwise comparisons of ATAC-seq, Start-seq, and nuclear RNA-seq for annotated coding genes. Boxes on diagonal show density plots of log signal enrichment for each assay. Boxes in on upper right show correlation values for each pairwise comparison. Boxes on lower left show contour plots representing scatterplots of log enrichment for each pairwise comparison. B) ATAC-seq and Start-seq signal accumulation at genes stratified into quintiles by nuclear RNA-seq signal. As is evident from lower ATAC-seq signal in the highest expression quintile, and by low stratification of lower quintiles by Start-seq, ATAC-seq and Start-seq are inconsistently correlated with nuclear RNA-seq. ATA C star tseq n ucRNAseq

ATAC startseq nucRNAseq

0 1 2 3 Corr: 0.458 Corr: 0.544 1 2 3 4 5 Corr: 0.667 1 2 3 1.0 1.5 2.0 2.5 3.0 3.5 1 2 3 4 0 2 4 Coun ts Coun ts A B

Supplementary Figure 1

(2)

Figure S2: A) Representative browser window example of gene with low pause index. Note low Start-seq and high nuclear RNA-seq values. B) Representative browser window example of gene with high pause index. Note high Start-seq and low nuclear RNA-seq values. C) Enrichment of “Pause Button” (PB) motif based on pause index (PI) quartiles. Bars are partitioned based on z-score of highest log-odds score for enrichment of the PB motif in every nuTSS in the cohort. High PI quartile (right) disproportionately contains nuTSSs with high PB z-scores. 0.00 0.25 0.50 0.75 1.00 1 2 3 4

Pause index quartile (low to high)

count V38 > 0.25 0 − 0.25 −0.25 − 0 < −0.25 Pause Button motif prevalence by

pause index quartile

Pr

opor

tion

1 2 3

Pause index quartile (low to high)4 C ATAC-seq Start-seq Nuclear RNA-seq 122 0 0 0 8000 14500 ATAC-seq Start-seq Nuclear RNA-seq 25 0 0 0 8000 14500

nAChRalpha3: Pause index = 226 Myosin heavy chain: Pause index = 0.02 A B = > 0.25 = 0 - 0.25 = -0.25 - 0 = < -0.25 Pause button motif enrichment z-score:

(3)

Figure S3: Comparison of the percentage of peaked (green points), broad (red), and weak (blue) clusters detected in this study (x-axis) as compared to Hoskins et al. [13] (A) or Nechaev et al. [19] (B) (y-axis). For each comparison, datasets were subsampled at equal read depth, signal was assigned to peaks using the same method, and the cluster distance threshold was varied (as denoted by colored point borders).

Supplementary Figure 3

● ● ● ● ● ● ● ● ●●● ●●● ● ● ● ● ●● ● 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Meers Hoskins threshold 5 10 15 25 50 75 100 type ● ● ● Broad Peaked Weak ● ● ● ● ● ● ● ● ● ●●●●● ● ● ●● ●●● 0.00 0.25 0.50 0.75 1.00 0.00 0.25 0.50 0.75 1.00 Meers Nechae v threshold 5 10 15 25 50 75 100 type ● ● ● Broad Peaked Weak

Cluster type percentages, Meers et al. vs. Hoskins et al.

A B Distance threshold (nt) Distance threshold (nt) Cluster type Cluster type

Cluster type percentages, Meers et al. vs. Nechaev et al.

Percentage of total clusters (Meers et al.)

Percentage of total clusters (Meers et al.) Per cen tage of t otal clust ers (Hosk ins et al .) Per cen tage of t otal clust ers (Nechaev et al .)

(4)

Figure S4: A) Correlation matrix of enrichment of 16 motifs used in clustering of obsTSSs. Dashed black lines denote group motifs co-occurring more frequently than others. B) Consensus motifs at obsTSSs based on motif-derived clustering. Cluster 1 resembles INR motif (TCAGT) from position -1 to 3, whereas other groups have lower sequence information content. A Cluster 1 Cluster 2 Cluster 3 B

Supplementary Figure 4

(5)

Figure S5: A) Metaplot of sense or antisense Start-seq signal mapping in a 500nt window around obsTSSs (teal and purple, respectively) or nuTSSs (salmon and green, respectively). B) Boxplot describing fraction of sense-oriented reads mapping in a 400nt window around nuTSSs (salmon) or obsTSSs (teal). C) Venn diagram describing representation of paired TSSs within high-confidence TSSs detected by Start-seq. D) Boxplots describing Start-seq (left) and ATAC-seq (right) signal corresponding to non-divergent obsTSSs (grey), divergent obsTSSs (i.e. obsTSSs paired with nuTSS, light green), and bidirectional obsTSSs (dark green). E) Metaplot describing BEAF32 ChIP-seq signal mapping in a 2kb window surrounding non-divergent (grey), divergent (light green), and bidirectional (dark green) obsTSSs. F) Histograms describing the distributions of distance between divergent (green) and convergent (purple) pairs of nearest neighbor obsTSSs oriented in opposite directions.

(6)

Figure S6: A) Metaplots describing ATAC-seq signal (top) and predicted nucleosome occupancy (bottom) in a 1 kb window surrounding obsTSSs (teal) or nuTSSs (red). B) Boxplot describing distribution of distances from the nearest obsTSS for all nuTSSs in each of 7 histone PTM-defined clusters described in Figure 5A. Negative values indicate nuTSSs that are upstream of the nearest obsTSS, while positive values indicate nuTSSs that are downstream of the nearest obsTSS. C) Nuclear RNA-seq reads were mapped to regions 200nt upstream and downstream of each TSS, and enrichment of downstream vs. upstream reads was analyzed as a proxy for elongation. At right: scatterplots of nuclear RNA-seq upstream reads (X-axis) vs. downstream reads (Y-axis) for nuTSSs (left, salmon) and obsTSSs (right, teal). D) Representative example of a nuTSS belonging to Cluster 1 (red box) that corresponds to a TSS that has been newly annotated in the latest D. melanogaster genome build (green box). ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −10000 −1000 −500 −200 −100 −10 −1 0 1 10 100 200 500 1000 10000 1 2 3 4 5 6 7 V21 V22 V21 1 2 3 4 5 6 7 A B C D Distance of nuTSSs in PTM clusters to nearest obsTSS

Upstr eam D ownstr eam Distanc e (n t) ATAC-seq Start-seq forward strand Nuclear RNA-seq forward strand 360 0 0 0 4070 250 Start-seq reverse strand 0 770 Nuclear RNA-seq reverse strand 0 440

Old gene model

New gene model

= obsTSS = nuTSS, Cluster 1 = newly annotated transcript Newly annotated start site: Fmr1 -0.5 TSS 0.5Kb 50 75 100 125 150 175 HWT ATAC-seq obsTSSs nuTSSs Nor m. A TA C-seq sig nal -0.5 TSS 0.5Kb 0.4 0.5 0.6

0.7 Predicted nucleosome occupancy obsTSSs nuTSSs

Nucleosome likelihood

HWT ATAC-seq

(7)

Figure S7: A) Percentage of all FlyLight enhancers (red) or those overlapping with a high-confidence nuTSS (blue) that correspond to each of several larval expression categories. B) Heatmap describing sense (right) or antisense (left) Start-seq signal mapping in a 500nt window around enhancer-associated nuTSSs. 100000 10000 1000 100 10 0 Antisense

Start-seq signal Sense Start-seq signal

Enhancer-associated nuTSSs -250 nt TSS 250 nt -250 nt TSS 250 nt A B Counts

Figure

Figure	 S2:	 A)	 Representative	 browser	 window	 example	 of	 gene	 with	 low	 pause	 index.	 Note	 low	 Start-seq	 and	 high	 nuclear	 RNA-seq	 values.	 B)	 Representative	 browser	 window	 example	 of	 gene	 with	 high	 pause	 index.	 Note	 high	 Start-
Figure	 S5:	A)	Metaplot	of	sense	or	antisense	Start-seq	signal	mapping	in	a	500nt	window	around	 obsTSSs	 (teal	 and	 purple,	 respectively)	 or	 nuTSSs	 (salmon	 and	 green,	 respectively).	 B)	 Boxplot	 describing	fraction	of	sense-oriented	reads	mapping
Figure	 S6:	 A)	 Metaplots	 describing	 ATAC-seq	 signal	 (top)	 and	 predicted	 nucleosome	 occupancy	 (bottom)	 in	 a	 1	 kb	 window	 surrounding	 obsTSSs	 (teal)	 or	 nuTSSs	 (red).	 B)	 Boxplot	 describing	 distribution	of	distances	from	the	nearest	ob
Figure	 S7:	 A)	 Percentage	 of	 all	 FlyLight	 enhancers	 (red)	 or	 those	 overlapping	 with	 a	 high- high-confidence	 nuTSS	 (blue)	 that	 correspond	 to	 each	 of	 several	 larval	 expression	 categories.	 B)	 Heatmap	describing	sense	(right)	or	antis

References

Related documents

This section outlines the method to find the best allocation of n distinguishable processors to m dis- tinguishable blocks so as to minimize the execution time.. Therefore,

The main predictions look at the level of identification that charity street donation collectors have with their job role (e.g., no ID, uniform/t-shirt, ID badge, and

As shown in this study, loyalty to the organization resulting from merger or acquisition has different intensity level for employees in different hierarchical

Recording Data for New or Revised Course (Record only new or changed course information.) Course prefix (3 letters) Course Number (3 Digits) Effective Term (Example:

The Department of Health, Physical Education, Recreation and Dance offers a Master of Science in Education in Health and Physical Education and a Master of Science in

We are now using the second part of our test database (see Figure 4 ) ; the boxpoints table which contains 3000 customer points, and the box table with 51 dierent sized bounding

Most companies recruit for full-time and internship positions, but some indicate Co-Op as a recruiting priority, while not attending Professional Practice

This authority will remain in effect until ST JOSEPH CATHOLIC CHURCH OF RAYNE LOUISIANA is notified by me (us) in writing to cancel it in such time as to afford ST JOSEPH