Windows to Infinity

Category: Programming

Server: play.scriptsorcerers.xyz:10406

Flag: scriptCTF{i_10v3_m4_w1nd0wwwwwwww5_058aa4930d6d}

TL;DR

We connect to a TCP service that prints N = 1,000,000 integers (each in [0, 100000]). For a fixed window size M = N/2 = 500,000, we must output M+1 values per round for 8 sliding-window tasks: sum, xor, mean (floor), median (lower), mode (max value on ties), mex, distinct count, and sum of pairwise GCDs. We solved each round in streaming/online fashion with O(1) or O(log V) amortized updates and careful buffering for I/O.

Challenge Overview

The service:

Prints 1,000,000 numbers once.

For each of 8 rounds, reads one line of our answers (M+1 integers).

Verifies and continues; on any mismatch it prints uh oh and closes.

On success through Round 8, prints the flag.

Key parameters:

N = 1,000,000

M = N/2 = 500,000

Value domain: 0 … 100,000 (important for our data structures)

Approach & Data Structures (Round by Round)

We implemented a single C++ client that:

Opens a TCP socket to the host/port.

Streams in all N numbers (fast parser).

Streams out answers per round (custom buffered writer).

Round 1 — Sums of each window

Method: Prefix sums: pref[i+1] = pref[i] + a[i].

Window sum: sum[i] = pref[i+M] - pref[i].

Complexity: O(N).

Round 2 — XORs of each window

Method: Prefix XORs: px[i+1] = px[i] ^ a[i].

Window xor: xor[i] = px[i+M] ^ px[i].

Complexity: O(N).

Round 3 — Means (floored)

Method: Reuse sums; output sum[i] / M.

Complexity: O(N).

Round 4 — Median (lower median for even M)

Spec nuance: The problem defines median via a[floor(n/2)]. For even M this is the lower median.

Method: Fenwick/BIT over value domain [0..100000] tracking frequencies.
- Update on slide: add(a[i+M]), remove(a[i]).
- Query k-th (0-based) with k = (M-1)/2.

Complexity: O((N + #(slides)) · log V) ≈ O(N log 1e5).

Round 5 — Mode (break ties toward larger value)

Method: Frequency array cnt[v], plus a lazy heap of pairs (-count, -value):
- On add/remove, push the new (−cnt[v], −v) into a min-heap.
- When querying, pop until the top matches current cnt (lazy deletion).
- Tie-breaking by larger value is handled by value.

Complexity: Amortized O(log V) per update/query.

Round 6 — Mex (minimum excluded)

Method: Maintain cnt[v] and a min-heap missing seeded with all v≥0.
- On add: cnt[v]++.
- On remove: if cnt[v] becomes 0, push v back into missing.
- Query: pop from missing until top has cnt[top]==0.

Complexity: Amortized O(log V).

Round 7 — # of Distinct Numbers

Method: cnt[v] and an integer distinct. Increment when a count goes 0→1, decrement when 1→0.

Complexity: O(1) per update.

Round 8 — Sum of pairwise GCDs in each window

We use a classic identity over positive integers:

∑i<jgcd⁡(xi,xj) = ∑d≥1φ(d) (F[d]2)\sum_{i<j} \gcd(x_i, x_j) \;=\; \sum_{d\ge 1} \varphi(d)\,\binom{F[d]}{2}

i<j∑gcd(xi,xj)=d≥1∑φ(d)(2F[d])

where F[d] = number of elements divisible by d in the window, and φ is Euler’s totient.

Zero handling: gcd(0, x) = x and gcd(0,0)=0. We exclude zeros from F and add a term for zero–positive pairs:

answer = ∑d≥1φ(d) (F[d]2) + (#zeros)⋅∑xi>0xi\text{answer} \;=\; \sum_{d\ge1}\varphi(d)\,\binom{F[d]}{2} \;+\; (\#\text{zeros}) \cdot \sum_{x_i>0} x_i

answer=d≥1∑φ(d)(2F[d])+(#zeros)⋅xi>0∑xi

Implementation:

Precompute phi[d] (linear sieve) and divisors divs[v] for all v≤100000.

Maintain:
- F[d] counts for all d.
- Spos = \sum_d φ(d)*C2(F[d]) incrementally:
  - On add v>0: for each d | v, Spos += φ(d) * F[d], then F[d]++.
  - On remove v>0: first F[d]--, then Spos -= φ(d) * F[d].
- zeros and sum_pos for the zero term.

Complexity: Each add/remove touches all divisors of v (≈ up to ~128 in worst/common cases). With 1e6 scale and buffered I/O it’s fine.

I/O & Engineering Notes

Fast input: Manual socket recv() with a simple digit parser (numbers separated by spaces/newlines).

Fast output: Buffered line writer; flush per round.

Domain-bounded DS: All value-indexed structures sized to MAXV+1 = 100001.

Pitfalls we hit (and fixed):
- A stray } (and a pasted diff marker }) accidentally closed main() early → compile errors & missing cli scope. Fix: remove the extra brace/line so the “tail read” remains inside main.
- Median definition: the judge uses lower median for even M. Setting k = (M-1)/2 is required; using upper median caused mismatches.

Build & Run

g++ -O3 -std=c++20 -march=native -pipe windows_to_infinity_solver.cpp -o solver
./solver play.scriptsorcerers.xyz 10406

You should see the round banners, then the flag:

Round 1: Sums!
...
Round 8: Sum of pairwise GCD!
scriptCTF{i_10v3_m4_w1nd0wwwwwwww5_058aa4930d6d}

Complexity Summary

Rounds 1–3: O(N)

Median: O((N) log V)

Mode: Amortized O(N log V)

Mex: Amortized O(N log V)

Distinct: O(N)

Pairwise GCD sum: ~O(N · avg_divisors(v))

With V=100001, all rounds finish comfortably when paired with fast I/O.

Lessons Learned

Clarify median definition (lower vs upper) early; it’s a common off-by-one trap.

For large sliding windows, data structures on the value domain (BIT, heaps with lazy deletion, divisor/phi precomp) are the way to go.

Zero handling in number-theory aggregates matters (gcd term).

Avoid leaking debug/diff artifacts into source (the } line).