Hacker News - Newest: "SSD"

Building a Web Framework from Scratch

3 May 2026 at 21:38

Draco is a Hack Club (https://hackclub.com) YSWS (You Ship We Ship): teenagers build a working server-side web framework from scratch. Ship it, and we send you a mechanical keyboard + SSD.

The idea came from building Beasty, my own HTTP server written from raw TCP. The moment you parse your first request line by hand and a browser actually responds, something clicks. You stop thinking of HTTP as magic and start thinking of it as bytes. That's the feeling I want 50 teenagers to have.

The project has six milestones, from opening a TCP socket all the way to middleware and custom routing. It's doable in a weekend if you're motivated: roughly 15 hours of focused work.
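
For flavor, the first milestone, parsing an HTTP request line by hand over a raw TCP socket, can be sketched in a few lines. This is hypothetical Python, not Draco's or Beasty's actual code:

```python
import socket

def parse_request_line(raw: bytes):
    """'GET /path HTTP/1.1' -> ('GET', '/path', 'HTTP/1.1')."""
    line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, version = line.split(" ")
    return method, path, version

def serve_once(host="127.0.0.1", port=8080):
    """Accept one connection and answer a real browser with plain text."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            method, path, _ = parse_request_line(conn.recv(4096))
            body = f"{method} {path}\n".encode()
            conn.sendall(b"HTTP/1.1 200 OK\r\n"
                         b"Content-Type: text/plain\r\n"
                         b"Content-Length: " + str(len(body)).encode() +
                         b"\r\n\r\n" + body)
```

Once a response like this renders in a browser, HTTP stops being magic; everything after (routing, middleware) is layering on top of this loop.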

Draco site: https://github.com/NewSmoke38/draco


Comments URL: https://news.ycombinator.com/item?id=48001803

Points: 7

# Comments: 1

Ask HN: Will hardware ever be cheap again?

By: bjourne
27 April 2026 at 21:55

Up until about 2015 it felt like hardware was always getting cheaper. Then something happened, and hardware stopped getting cheaper every year. Adjusted for inflation, a mid-range laptop or desktop costs me much more today than it did back then. Yes, it has better specs, but it needs them to make up for the extra bloat of all applications.

Given DRAM, CPU, GPU, and SSD shortages it does not seem hardware will become cheaper in the short term. Do you think it will ever go back to how it was ten years ago or is this the new normal?


Comments URL: https://news.ycombinator.com/item?id=47927899

Points: 30

# Comments: 8

Show HN: WayInfer – Native GGUF engine that runs models larger than your RAM

By: ahmedm24
2 April 2026 at 14:24

We built a native inference engine that runs quantized LLMs directly from SSD using memory-mapped I/O. The model never fully loads into RAM; the OS pages weights on demand as each layer executes.

*What it does:*

  - Mixtral 8x22B (80GB, 141B params) runs on a machine with 48GB RAM
  - Model loads in 0.3 seconds (vs 190s with llama.cpp)
  - Produces correct output: "What is 2+2?" → "The sum of 2 and 2 is 4."
  - Zero dependencies: custom tensor engine, custom GGUF parser, no ggml/llama.cpp

*How it works:*

  - `mmap()` the GGUF file; the OS handles SSD→RAM paging transparently
  - Quantize the input to Q8_K and compute dot products directly against Q4_K/Q5_K/Q6_K weights in the quantized domain, with no dequantization to float32
  - AVX2 SIMD + 8-thread parallel matvec
  - For MoE models, only 2 of 8 experts are active per token, so most weights stay cold on disk
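
The mmap idea is simple to illustrate. Below is a minimal Python sketch (not WayInfer's code, which is native; the flat fixed-size layer layout and `LAYER_BYTES` constant are made up, whereas GGUF's real layout is tensor-based): slicing the map touches only those pages, so the OS faults in just that layer's bytes from SSD, and the file can be far larger than physical RAM.

```python
import mmap

LAYER_BYTES = 1024  # hypothetical size of one layer's weights

def read_layer(path: str, layer: int) -> bytes:
    """Read one layer's weights without ever loading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            off = layer * LAYER_BYTES
            # Slicing the map page-faults in only these bytes from disk.
            return mm[off:off + LAYER_BYTES]
```

This is also why the 0.3-second "load" time is possible: `mmap()` itself reads almost nothing; the cost is paid lazily as each layer is first touched.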

*The hard part we solved:* GGUF models are calibrated for a specific dot-product computation path (ggml's "quantize input → integer multiply-accumulate → late float conversion"). If you naively dequantize weights to float32 and do a standard dot product, the per-operation error is tiny (~0.001%) but compounds across 56 transformer layers into completely wrong output. We had to reverse-engineer and match ggml's exact scalar computation, block-level integer accumulation with an 8-lane parallel reduction, to get correct results.
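
The integer-domain path can be illustrated with a toy symmetric per-block quantization scheme. The real Q4_K/Q8_K super-block layouts (with mins and 6-bit scales) are more involved; this sketch only shows the shape of "integer accumulate per block, convert to float late":

```python
BLOCK = 32  # elements per quantization block

def quantize(vals, levels):
    """Per-block symmetric quantization: ints in [-levels, levels] plus a float scale."""
    blocks = []
    for i in range(0, len(vals), BLOCK):
        chunk = vals[i:i + BLOCK]
        amax = max(abs(v) for v in chunk) or 1.0
        scale = amax / levels
        blocks.append((scale, [round(v / scale) for v in chunk]))
    return blocks

def quantized_dot(w_blocks, x_blocks):
    """Dot product computed block-by-block in the quantized (integer) domain."""
    total = 0.0
    for (sw, wq), (sx, xq) in zip(w_blocks, x_blocks):
        acc = sum(a * b for a, b in zip(wq, xq))  # pure integer accumulate
        total += sw * sx * acc                    # late conversion to float
    return total
```

For example, 4-bit-like weights (`levels=7`) against an 8-bit input (`levels=127`) stay in exact integer arithmetic inside each block, with a single float multiply per block at the end, which is the property the calibration depends on.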

*What it doesn't do (yet):*

  - Speed: ~0.08 tok/s on the 80GB model (CPU-only, no GPU offload)
  - No interactive chat UI
  - Only K-quant GGUF formats (Q4_K_M, Q5_K_M, Q6_K; covers ~90% of models on HuggingFace)
  - Windows only (Linux stubs exist but untested)

The architecture comes from my work-in-progress WayOS (https://github.com/cloudlinqed/WayOS), an AI-first OS that treats SSD/RAM/VRAM as a unified memory hierarchy.

GitHub: https://github.com/cloudlinqed/WayInfer


Comments URL: https://news.ycombinator.com/item?id=47614947

Points: 1

# Comments: 0

Show HN: Open-source encrypted backup CLI

By: loichrn
16 March 2026 at 13:13

I’ve been building an open-source backup CLI in Go: https://github.com/Cloudstic/cli

Docs: https://docs.cloudstic.com

Features:

  - encrypted backups
  - content-addressed deduplication
  - local / S3 / B2 / SFTP storage
  - local / Google Drive / OneDrive / SFTP sources
  - restore to ZIP or directory

One thing I wanted to get right was portable drives. If the same external SSD moves between machines, the tool uses its GPT partition UUID to keep the backup history tied to the drive itself, instead of treating every new mount path as a different source.
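
For readers unfamiliar with content-addressed deduplication, here is a toy sketch of the idea (assumed mechanics only; not Cloudstic's actual chunking algorithm or on-disk format): data is split into chunks, each chunk is keyed by its SHA-256, and a backup is just a manifest of chunk hashes, so identical chunks are stored once.

```python
import hashlib

CHUNK = 4096  # fixed chunk size for the sketch; real tools often use content-defined chunking

def store(data: bytes, chunk_store: dict) -> list:
    """Split data into chunks, store each under its hash, return the manifest."""
    manifest = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)  # dedup: keep only the first copy
        manifest.append(digest)
    return manifest

def restore(manifest: list, chunk_store: dict) -> bytes:
    """Reassemble the original bytes from the manifest."""
    return b"".join(chunk_store[d] for d in manifest)
```

A second backup of mostly-unchanged data then adds only the chunks whose hashes are new, which is where the space savings come from.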

Recent posts:

  - https://blog.cloudstic.com/2026/03/12/backing-up-portable-drives/
  - https://blog.cloudstic.com/2026/03/16/practical-backups-with-cloudstic-profiles/

Would love feedback.

Comments URL: https://news.ycombinator.com/item?id=47398576

Points: 1

# Comments: 0

70M vectors searched in 48ms on a single consumer GPU – results you won't believe

16 March 2026 at 16:12

I built a prototype GPU-based vector search system that runs locally on a consumer PC.

Hardware:

  - RTX 3090
  - consumer CPU
  - NVMe SSD

Dataset:

~70 million vectors (384 dimensions)

Performance:

~48 ms search latency for top-k results.

This corresponds to roughly 1.45 billion vector comparisons per second on a single GPU (70 million vectors scanned in 0.048 s).

The system uses a custom GPU kernel and a two-stage search pipeline (binary filtering + floating-point reranking).
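
A minimal CPU sketch of such a two-stage pipeline, using sign-bit binary codes and Hamming distance for the filter and exact float dot products for the rerank (all shapes and parameters here are illustrative; the author's custom GPU kernel is a different implementation):

```python
import numpy as np

def binarize(vecs):
    """Pack the sign bits of each vector: 384 floats -> 48 bytes."""
    return np.packbits(vecs > 0, axis=-1)

def search(db, db_codes, query, shortlist=100, k=10):
    q_code = binarize(query)
    # Stage 1: Hamming distance on the packed binary codes (popcount per row)
    ham = np.unpackbits(db_codes ^ q_code, axis=1).sum(axis=1)
    cand = np.argpartition(ham, shortlist)[:shortlist]
    # Stage 2: exact float rerank, but only on the shortlist
    scores = db[cand] @ query
    return cand[np.argsort(-scores)[:k]]
```

The payoff is that stage 1 touches 48 bytes per vector instead of 1,536, so the expensive float math runs on a shortlist rather than all 70M vectors.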

My goal was to explore whether large-scale vector search could run efficiently on consumer hardware instead of large datacenter clusters.

After thousands of hours of work and many failed attempts, the results finally became stable enough to benchmark.

I'm currently exploring how far this approach can scale.

I'd be very interested to hear how others approach large-scale vector search on consumer hardware.

Happy to answer questions.


Comments URL: https://news.ycombinator.com/item?id=47400954

Points: 1

# Comments: 4
