Windows to Infinity

Category: Programming

Server: play.scriptsorcerers.xyz:10406

Flag: scriptCTF{i_10v3_m4_w1nd0wwwwwwww5_058aa4930d6d}

TL;DR

We connect to a TCP service that prints N = 1,000,000 integers (each in [0, 100000]). For a fixed window size M = N/2 = 500,000, we must output M+1 values per round for 8 sliding-window tasks: sum, xor, mean (floor), median (lower), mode (max value on ties), mex, distinct count, and sum of pairwise GCDs. We solved each round in streaming/online fashion with O(1) or O(log V) amortized updates and careful buffering for I/O.


Challenge Overview

The service:

  • Prints 1,000,000 numbers once.
  • For each of 8 rounds, reads one line of our answers (M+1 integers).
  • Verifies and continues; on any mismatch it prints uh oh and closes.
  • On success through Round 8, prints the flag.

Key parameters:

  • N = 1,000,000
  • M = N/2 = 500,000
  • Value domain: 0 … 100,000 (important for our data structures)

Approach & Data Structures (Round by Round)

We implemented a single C++ client that:

  • Opens a TCP socket to the host/port.
  • Streams in all N numbers (fast parser).
  • Streams out answers per round (custom buffered writer).

Round 1 — Sums of each window

  • Method: Prefix sums: pref[i+1] = pref[i] + a[i].
  • Window sum: sum[i] = pref[i+M] - pref[i].
  • Complexity: O(N).

Round 2 — XORs of each window

  • Method: Prefix XORs: px[i+1] = px[i] ^ a[i].
  • Window xor: xor[i] = px[i+M] ^ px[i].
  • Complexity: O(N).

Round 3 — Means (floored)

  • Method: Reuse sums; output sum[i] / M.
  • Complexity: O(N).

Round 4 — Median (lower median for even M)

  • Spec nuance: The problem defines median via a[floor(n/2)]. For even M this is the lower median.
  • Method: Fenwick/BIT over value domain [0..100000] tracking frequencies.
    • Update on slide: add(a[i+M]), remove(a[i]).
    • Query k-th (0-based) with k = (M-1)/2.
  • Complexity: O((N + #(slides)) · log V) ≈ O(N log 1e5).

Round 5 — Mode (break ties toward larger value)

  • Method: Frequency array cnt[v], plus a lazy heap of pairs (-count, -value):
    • On add/remove, push the new (−cnt[v], −v) into a min-heap.
    • When querying, pop until the top matches current cnt (lazy deletion).
    • Tie-breaking by larger value is handled by value.
  • Complexity: Amortized O(log V) per update/query.

Round 6 — Mex (minimum excluded)

  • Method: Maintain cnt[v] and a min-heap missing seeded with all v≥0.
    • On add: cnt[v]++.
    • On remove: if cnt[v] becomes 0, push v back into missing.
    • Query: pop from missing until top has cnt[top]==0.
  • Complexity: Amortized O(log V).

Round 7 — # of Distinct Numbers

  • Method: cnt[v] and an integer distinct. Increment when a count goes 0→1, decrement when 1→0.
  • Complexity: O(1) per update.

Round 8 — Sum of pairwise GCDs in each window

We use a classic identity over positive integers:

∑i<jgcd⁡(xi,xj)  =  ∑d≥1φ(d) (F[d]2)\sum_{i<j} \gcd(x_i, x_j) \;=\; \sum_{d\ge 1} \varphi(d)\,\binom{F[d]}{2}

i<j∑gcd(xi,xj)=d≥1∑φ(d)(2F[d])

where F[d] = number of elements divisible by d in the window, and φ is Euler’s totient.

Zero handling: gcd(0, x) = x and gcd(0,0)=0. We exclude zeros from F and add a term for zero–positive pairs:

answer  =  ∑d≥1φ(d) (F[d]2)  +  (#zeros)⋅∑xi>0xi\text{answer} \;=\; \sum_{d\ge1}\varphi(d)\,\binom{F[d]}{2} \;+\; (\#\text{zeros}) \cdot \sum_{x_i>0} x_i

answer=d≥1∑φ(d)(2F[d])+(#zeros)⋅xi>0∑xi

Implementation:

  • Precompute phi[d] (linear sieve) and divisors divs[v] for all v≤100000.
  • Maintain:
    • F[d] counts for all d.
    • Spos = \sum_d φ(d)*C2(F[d]) incrementally:
      • On add v>0: for each d | v, Spos += φ(d) * F[d], then F[d]++.
      • On remove v>0: first F[d]--, then Spos -= φ(d) * F[d].
    • zeros and sum_pos for the zero term.

Complexity: Each add/remove touches all divisors of v (≈ up to ~128 in worst/common cases). With 1e6 scale and buffered I/O it’s fine.


I/O & Engineering Notes

  • Fast input: Manual socket recv() with a simple digit parser (numbers separated by spaces/newlines).
  • Fast output: Buffered line writer; flush per round.
  • Domain-bounded DS: All value-indexed structures sized to MAXV+1 = 100001.
  • Pitfalls we hit (and fixed):
    • A stray } (and a pasted diff marker }) accidentally closed main() early → compile errors & missing cli scope. Fix: remove the extra brace/line so the “tail read” remains inside main.
    • Median definition: the judge uses lower median for even M. Setting k = (M-1)/2 is required; using upper median caused mismatches.

Build & Run

g++ -O3 -std=c++20 -march=native -pipe windows_to_infinity_solver.cpp -o solver
./solver play.scriptsorcerers.xyz 10406

You should see the round banners, then the flag:

Round 1: Sums!
...
Round 8: Sum of pairwise GCD!
scriptCTF{i_10v3_m4_w1nd0wwwwwwww5_058aa4930d6d}

Complexity Summary

  • Rounds 1–3: O(N)
  • Median: O((N) log V)
  • Mode: Amortized O(N log V)
  • Mex: Amortized O(N log V)
  • Distinct: O(N)
  • Pairwise GCD sum: ~O(N · avg_divisors(v))

With V=100001, all rounds finish comfortably when paired with fast I/O.


Lessons Learned

  • Clarify median definition (lower vs upper) early; it’s a common off-by-one trap.
  • For large sliding windows, data structures on the value domain (BIT, heaps with lazy deletion, divisor/phi precomp) are the way to go.
  • Zero handling in number-theory aggregates matters (gcd term).
  • Avoid leaking debug/diff artifacts into source (the } line).