Widespread retention of ohnologs in key developmental gene families following whole genome duplication in arachnopulmonates

Whole genome duplications (WGD) have occurred multiple times in the evolution of animals, including in the lineages leading to vertebrates, teleosts, horseshoe crabs and arachnopulmonates. These dramatic genomic events initially produce a wealth of new genetic material, which is generally followed by extensive gene loss. It appears, however, that developmental genes such as homeobox genes, signalling pathway components and microRNAs tend to be more frequently retained in duplicate following WGD (ohnologs). These not only provide the best evidence for the occurrence of WGD, but also an opportunity to study its evolutionary implications. Although these genes are relatively well studied in the context of vertebrate WGD, genomic and transcriptomic data for independent comparison in other groups are scarce, with patchy sampling of only two of the five extant arachnopulmonate orders. To improve our knowledge of developmental gene repertoires, and their evolution since the arachnopulmonate WGD, we sequenced embryonic transcriptomes from two additional spider species and two whip spider species and surveyed them for three important gene families: Hox, Wnt and frizzled. We report extensive retention of ohnologs in all four species, further supporting the arachnopulmonate WGD hypothesis. Thanks to improved sampling we were able to identify patterns of likely ohnolog retention and loss within spiders, including apparent differences between major clades. The two amblypygid species have larger ohnolog repertoires of these genes than both spiders and scorpions, including the first reported duplicated Wnt1/wg, the first Wnt10 recovered in an arachnid, and broad retention of frizzled genes. These insights shed light on the evolution of the enigmatic whip spiders, highlight the importance of the comparative approach within lineages, and provide substantial new transcriptomic data for future study.

The duplication of genetic material is widely accepted to be an important contributor to the evolution of morphological and physiological innovations (Ohno 1970; Zhang 2003

Several Wnt families have also been retained after the 1R and 2R events in vertebrates; for example, there are two copies each of Wnt2, Wnt3, Wnt5, Wnt7, Wnt8, Wnt9, and Wnt10 in humans (Miller 2001; Janssen et al. 2010). However, no subfamilies are represented by three or four copies in humans, so there is some consistency with arachnopulmonates in that the Wnts may be more conservative markers of WGD, to be used in combination with Hox and other homeobox genes.
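The human Wnt copy numbers quoted above can be tallied from gene symbols. The following is a minimal sketch, not part of the original analysis: the list is the standard set of 19 human WNT gene symbols, and the function names `subfamily_counts` and `duplicated_subfamilies` are hypothetical helpers introduced for illustration.

```python
import re
from collections import Counter

# Standard human WNT gene symbols (assumed input; not taken from this study's data).
HUMAN_WNT_GENES = [
    "WNT1", "WNT2", "WNT2B", "WNT3", "WNT3A", "WNT4", "WNT5A", "WNT5B",
    "WNT6", "WNT7A", "WNT7B", "WNT8A", "WNT8B", "WNT9A", "WNT9B",
    "WNT10A", "WNT10B", "WNT11", "WNT16",
]


def subfamily_counts(genes):
    """Count genes per Wnt subfamily, ignoring the trailing paralog letter."""
    counts = Counter()
    for gene in genes:
        match = re.fullmatch(r"WNT(\d+)[A-Z]?", gene)
        if match is None:
            raise ValueError(f"unrecognised Wnt gene symbol: {gene}")
        counts["Wnt" + match.group(1)] += 1
    return dict(counts)


def duplicated_subfamilies(genes):
    """Return subfamilies with more than one copy, sorted by subfamily number."""
    counts = subfamily_counts(genes)
    duplicated = [name for name, n in counts.items() if n > 1]
    return sorted(duplicated, key=lambda name: int(name[3:]))


if __name__ == "__main__":
    print(duplicated_subfamilies(HUMAN_WNT_GENES))
    print(max(subfamily_counts(HUMAN_WNT_GENES).values()))
```

Under these assumptions the tally recovers the seven two-copy subfamilies named in the text, and no subfamily exceeds two copies.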

The extensive and consistent retention of key developmental genes like Hox genes apparent in P.

To explore the extent of duplication in these arachnopulmonates we then surveyed the copy number of

Embryos of C. acosta were collected at multiple stages of development, supporting the hypothesis that this may be a true loss, rather than absence of expression at a particular developmental stage.
However, the apparent additional absence of Scr and Ubx duplicates in E. bacillifer could equivocally indicate lineage-specific losses or absence of expression at a single timepoint.
The duplication of Hox genes is consistent among the three arachnopulmonate orders studied to date, and specific repertoires appear to be fairly conserved at the order level (this study; Schwager et al.

The resolution of the Ce. sculpturatus Wnt1 paralogs had lower support and their relationship is therefore more ambiguous. The current placement of Cs-Wnt1-2 as sister to Cs-Wnt1-1+(Ca-Wnt1-1+Eb-Wnt1-1) lends support to a lineage-specific duplication, but support for this topology is middling (75%, Figure 4), and it is noteworthy that the two Ce. sculpturatus sequences are recovered from different genomic scaffolds (see Supplementary Table 2 for accession numbers).

The presence of Wnt10 in both amblypygids is also intriguing because it is absent from all other

Analysis of the M. muscosa transcriptome also returned a second copy of fz2, which is not shared by any other arachnid to date. These two paralogs form a well-supported clade (95%, Figure 6), indicating that this is the result of a lineage-specific tandem duplication followed by rapid sequence divergence in

Both amblypygid species have a large repertoire of frizzled genes compared to other arachnids, expressing all four orthology groups with two copies each of fz1, fz3 and fz4 (Figures 5-6). Duplicates of fz1 and fz3 appear to be unique to amblypygids. The fz1 duplicates could be ohnologs retained from the arachnopulmonate WGD, as they form separate clades with the fz1 genes of other arachnopulmonates and exhibit reasonable sequence divergence (support values ≥98%, paralog sequence similarity 76-77%; Figure 6). The origin of the fz3 duplication is less clear; although the four which has low support (35%; Figure 6). Therefore we cannot yet confirm the timing of the duplication.
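Paralog similarity figures such as the 76-77% reported for the fz1 duplicates are typically derived from a pairwise alignment. The sketch below shows one common way to compute such a value, as percent identity over aligned columns where neither sequence has a gap. It is an assumption that this definition matches the one used in the study, and the function name `percent_identity` and the toy sequences are purely illustrative, not real frizzled data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Columns in which either sequence has a gap are skipped, so the
    denominator is the number of columns compared, not alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical): 4 matches over 6 ungapped columns.
    print(round(percent_identity("MKT-LLA", "MKSALLV"), 1))
```

Other definitions (for example, dividing by full alignment length, or scoring similarity with a substitution matrix rather than strict identity) give different numbers, so values are only comparable when computed the same way.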

Supplementary Data File 4). We propose that these duplicates are probably retained from the ancestral

we cannot conclusively identify their origin but we hypothesise that these fz4 duplicates reflect a lineage-specific duplication in either the ancestor of amblypygids or that of Pedipalpi (the larger clade to which amblypygids belong).

Previous studies of spiders, scorpions, and ticks indicated that frizzled repertoires in these groups are restricted to three or four copies, often with incomplete representation of the four orthology groups.

Analysis of the new transcriptomes for the spiders M. muscosa and Pa. amentata is consistent with this pattern, albeit with a unique duplication of fz2 in the jumping spider. We also recovered a single copy of fz2 in Ce. sculpturatus, which was missing from previous work on Me. martensii (Janssen et al. 2015).

The absence of fz2 in the latter could result from a lineage-specific loss or an issue with genome

Although they are unlikely to be directly responsible, the divergence in gene repertoires we see between Ph. phalangioides and the other spider lineages might provide a starting point for understanding these important morphological differences.

The amblypygids emerge as a key group of interest for studying the impacts of WGD owing to their high

This was the case for Wnt1/wg, Wnt6, Wnt7, and potentially fz2 (Figures 4, 6). However, levels of sequence similarity in these cases were comparable for Ce. sculpturatus paralogs and amblypygid and spider ohnologs, when we might expect within-lineage duplicates to show higher similarity. The