
Ask HN: Former master-tech building AI systems – how to break into software?

So to keep a very long story short: I'm 28, and exactly a year ago in April I was recognized as one of the top 2000 technicians in the nation by Stellantis North America. I was (and still am) dealing with lifelong knee and hip pain from the wear and tear of automotive repair, so I knew automotive was over for me. Then I discovered claude.ai, and one question led to another; next thing you know I'm building a dual-RTX machine: MSI B550 Tomahawk MAX WiFi, 64GB RAM, AMD AM4 5900XT CPU, 3TB NVMe SSD. Within two months of discovering the potential of CS, I had created a ton of projects to teach myself, all while the world around me was crashing and burning. I built printmakerai.com. I know AI is amazing at coding and very useful, but I'm the kind of guy who firmly believes "if you want it done right you must do it yourself."

The reason I'm posting is that I'm seeking advice on what I should do. I can't land a career in CS without a formal degree, yet I'm coming from one of the most technically advanced trades there is. We aren't "mechanics" anymore; we're technicians, and for good reason. I can read CAN bus traffic, reason about sensor fusion and redundant systems, trace wiring diagrams, separate circuits and modules, and accurately diagnose a faulty module or circuit. I was even able to translate oil wicking to solder wicking. Like how 2018+ Wranglers throw excessive dash lights and can stall? Everyone assumes the ABS module, because the ABS is throwing the DTC and there was a model year plagued with bad ABS modules. But it's really a star connector behind the glove box with solder wicking down the PCB. And even being this capable is clearly not enough to land me a job in any real CS role. Thanks to claude.ai I found what makes my brain light up like a Christmas tree, but I genuinely need help navigating this confusing field. That said, I'll be teaching myself Python today because, well, I can't code myself, lmao. And I know that's blocker number one.


Comments URL: https://news.ycombinator.com/item?id=48076414

Points: 2

# Comments: 0

Building a Web Framework from Scratch

Draco is a Hack Club (https://hackclub.com) YSWS (You Ship We Ship): teenagers build a working server-side web framework from scratch. Ship it, and we send you a mechanical keyboard + SSD.

The idea came from building Beasty, my own HTTP server from raw TCP. The moment you parse your first request line by hand and a browser actually responds, something clicks. You stop thinking of HTTP as magic and start thinking of it as bytes. That's the feeling I want 50 teenagers to have.
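That first request-line parse is smaller than it sounds. Here is a minimal sketch in Python (not Draco or Beasty code, just the shape of the first milestone; the names `parse_request_line` and `serve` are my own): open a TCP socket, read raw bytes, split the request line, and write the response by hand.

```python
import socket

def parse_request_line(raw: bytes):
    """Parse the first line of an HTTP request, e.g. b'GET /path HTTP/1.1'."""
    line = raw.split(b"\r\n", 1)[0]
    method, path, version = line.decode("ascii").split(" ")
    return method, path, version

def serve(host="127.0.0.1", port=8080):
    # Milestone one: a listening TCP socket, no framework, no magic.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        while True:
            conn, _ = srv.accept()
            with conn:
                method, path, _ = parse_request_line(conn.recv(65536))
                body = f"you sent {method} {path}".encode()
                # HTTP responses are just bytes too: status line, headers, blank line, body.
                conn.sendall(
                    b"HTTP/1.1 200 OK\r\n"
                    b"Content-Type: text/plain\r\n"
                    b"Content-Length: " + str(len(body)).encode() + b"\r\n"
                    b"\r\n" + body
                )

# serve()  # uncomment, then visit http://127.0.0.1:8080/hello in a browser
```

Everything past this point (routing, middleware) is layering functions on top of that loop.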

The project has 6 milestones, from opening a TCP socket all the way to middleware and custom routing. Doable in a weekend if you're motivated: ~15 hours of focused work.

Draco site: https://github.com/NewSmoke38/draco


Comments URL: https://news.ycombinator.com/item?id=48001803

Points: 7

# Comments: 1

Ask HN: Will hardware ever be cheap again?

Up until about 2015 it felt like hardware was always getting cheaper. Then something happened, and hardware stopped getting cheaper every year. Adjusted for inflation, a mid-range laptop or desktop costs me much more today than it did back then. Yes, it has better specs, but it needs them just to keep up with the extra bloat of modern applications.

Given DRAM, CPU, GPU, and SSD shortages it does not seem hardware will become cheaper in the short term. Do you think it will ever go back to how it was ten years ago or is this the new normal?


Comments URL: https://news.ycombinator.com/item?id=47927899

Points: 30

# Comments: 8

Show HN: WayInfer – Native GGUF engine that runs models larger than your RAM

We built a native inference engine that runs quantized LLMs directly from SSD using memory-mapped I/O. The model never fully loads into RAM: the OS pages weights on demand as each layer executes.
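The mmap idea fits in a few lines of Python. This is a sketch of the OS-paging technique, not WayInfer's actual code (which is native); `map_weights` and its parameters are illustrative assumptions.

```python
import mmap
import numpy as np

def map_weights(path: str, offset: int, n_bytes: int) -> np.ndarray:
    """View a slice of a weights file without reading it into RAM.

    The OS faults 4 KiB pages in from SSD only when the array is
    actually touched, and can evict them again under memory pressure,
    so the file can be far larger than physical RAM.
    """
    f = open(path, "rb")
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # frombuffer is a zero-copy view over the mapping; nothing is read yet
    return np.frombuffer(mm, dtype=np.uint8, count=n_bytes, offset=offset)
```

A real GGUF loader would read the header to find each tensor's offset and dtype, then hand out views like this per tensor; cold tensors (e.g. inactive MoE experts) simply never get paged in.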

*What it does:*
- Mixtral 8x22B (80GB, 141B params) runs on a machine with 48GB RAM
- Model loads in 0.3 seconds (vs 190s with llama.cpp)
- Produces correct output: "What is 2+2?" → "The sum of 2 and 2 is 4."
- Zero dependencies: custom tensor engine, custom GGUF parser, no ggml/llama.cpp

*How it works:*
- `mmap()` the GGUF file. The OS handles SSD→RAM paging transparently
- Quantize the input to Q8_K, compute dot products directly against Q4_K/Q5_K/Q6_K weights in the quantized domain, with no dequantization to float32
- AVX2 SIMD + 8-thread parallel matvec
- For MoE models: only 2 of 8 experts are active per token, so most weights stay cold on disk
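The quantized-domain dot product can be illustrated with a toy version in Python: per-block symmetric int8 quantization, integer multiply-accumulate within each block, and conversion to float only at the end. This is a simplified sketch, not WayInfer's kernel or ggml's exact K-quant layout (real Q4_K/Q6_K blocks carry sub-block scales and minima; `BLOCK = 32` here is an assumption).

```python
import numpy as np

BLOCK = 32  # elements per quantization block (simplified vs real K-quants)

def quantize_q8(x: np.ndarray):
    """Per-block symmetric int8 quantization: x ≈ scale * q, one scale per block."""
    blocks = x.reshape(-1, BLOCK)
    scales = np.abs(blocks).max(axis=1) / 127.0
    scales[scales == 0] = 1.0  # avoid divide-by-zero on all-zero blocks
    q = np.round(blocks / scales[:, None]).astype(np.int8)
    return q, scales

def quantized_dot(qa: np.ndarray, sa: np.ndarray,
                  qb: np.ndarray, sb: np.ndarray) -> float:
    """Integer multiply-accumulate per block, float conversion only at the end."""
    acc = (qa.astype(np.int32) * qb.astype(np.int32)).sum(axis=1)  # int MAC
    return float((acc * sa * sb).sum())  # late float scaling
```

The point of the late float conversion is that both sides accumulate in the same integer domain the model was calibrated against, which is why the post's error-compounding problem appears if you dequantize to float32 first and take a standard dot product instead.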

*The hard part we solved:* GGUF models are calibrated for a specific dot product computation path (ggml's "quantize input → integer multiply-accumulate → late float conversion"). If you naively dequantize weights to float32 and do a standard dot product, the per-operation error is tiny (~0.001%) but compounds across 56 transformer layers into completely wrong output. We had to reverse-engineer and match ggml's exact scalar computation, block-level integer accumulation with 8-lane parallel reduction, to get correct results.

*What it doesn't do (yet):*
- Speed: ~0.08 tok/s on the 80GB model (CPU-only, no GPU offload)
- No interactive chat UI
- Only K-quant GGUF formats (Q4_K_M, Q5_K_M, Q6_K, covering ~90% of models on HuggingFace)
- Windows only (Linux stubs exist but untested)

The architecture comes from my "work in progress" WayOS (https://github.com/cloudlinqed/WayOS), an AI-first OS that treats SSD/RAM/VRAM as a unified memory hierarchy.

GitHub: https://github.com/cloudlinqed/WayInfer


Comments URL: https://news.ycombinator.com/item?id=47614947

Points: 1

# Comments: 0
