-
Notifications
You must be signed in to change notification settings - Fork 19
Description
Hi,
I ran into a behaviour that was unexpected to me, and wanted to check if it is intended (I didn't see any mention of it in the documentation, apologies if I missed it somewhere). We're sometimes working with ranges with negative coordinates (e.g., relative to a specific landmark), and the following works as expected:
> a <- IRanges(start = -5, end = 5)
> b <- IRanges(start = -10, end = 10)
> intersect(a, b)
IRanges object with 1 range and 0 metadata columns:
start end width
<integer> <integer> <integer>
[1] -5 5 11
However, for GRanges objects, intersect discards all negative coordinates:
> a <- GRanges(seqnames = "chr1", IRanges(start = -5, end = 5))
> b <- GRanges(seqnames = "chr1", IRanges(start = -10, end = 10))
> intersect(a, b)
GRanges object with 1 range and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr1 1-5 *
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
Specifying seqlengths doesn't seem to make any difference, nor designating the chromosome as circular.
pintersect, on the other hand, keeps the negative coordinates for GRanges objects:
> pintersect(a, b)
GRanges object with 1 range and 1 metadata column:
seqnames ranges strand | hit
<Rle> <IRanges> <Rle> | <logical>
[1] chr1 -5-5 * | TRUE
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
However, here I noticed another thing that was slightly counterintuitive. While the following looks like I would expect:
> a <- GRanges(seqnames = "chr1", IRanges(start = c(-3, 5, 12), end = c(6, 12, 15)))
> b <- GRanges(seqnames = "chr1", IRanges(start = -10, end = 10))
> pintersect(a, b, drop.nohit.ranges = TRUE)
GRanges object with 2 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr1 -3-6 *
[2] chr1 5-10 *
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
if I change the last range in a to start at position 11 (still not overlapping b) it would not be dropped:
> a <- GRanges(seqnames = "chr1", IRanges(start = c(-3, 5, 11), end = c(6, 12, 15)))
> b <- GRanges(seqnames = "chr1", IRanges(start = -10, end = 10))
> pintersect(a, b, drop.nohit.ranges = TRUE)
GRanges object with 3 ranges and 0 metadata columns:
seqnames ranges strand
<Rle> <IRanges> <Rle>
[1] chr1 -3-6 *
[2] chr1 5-10 *
[3] chr1 11-10 *
-------
seqinfo: 1 sequence from an unspecified genome; no seqlengths
Session info
R Under development (unstable) (2025-01-03 r87521)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Zurich
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] GenomicRanges_1.59.1 GenomeInfoDb_1.43.2 IRanges_2.41.2 S4Vectors_0.45.2 BiocGenerics_0.53.3
[6] generics_0.1.3
loaded via a namespace (and not attached):
[1] httr_1.4.7 compiler_4.5.0 R6_2.5.1 XVector_0.47.2
[5] tools_4.5.0 GenomeInfoDbData_1.2.13 rstudioapi_0.17.1 UCSC.utils_1.3.1
[9] jsonlite_1.8.9