AbstractPhila PRO

AbstractPhil

https://civitai.com/user/AbstractPhila

AbstractEyes

AI & ML interests

datasets, research papers, experimentation, vision, classification, text encoders, tokenization, llms, diffusion, distillation, and more.

Recent Activity

reacted to OzTianlu's post with 🧠 about 1 hour ago

ResNet is Explicit Euler. GPT is Implicit Euler. What Else is Hiding in Plain Sight?

Read online: https://datawhalechina.github.io/learning-terrain/

I wrote an open-source monograph on learning dynamics: The Terrain of Learning. Bilingual (Chinese/English), 4 volumes, 12 chapters, 30+ print-grade figures. Completely free (CC BY-NC-SA 4.0).

The core argument: gradient descent is not optimization. It's terrain motion. The loss function is a landscape. The gradient is the direction of slope. The optimizer is how you choose each step.

Once you see it this way, everything clicks:

ResNet = explicit Euler integration on a vector field. The residual branch is the vector field. Each layer takes one Euler step.

GPT autoregression = implicit-state Euler iteration. Stable where explicit Euler explodes. That's why transformers handle long-range dependencies.

DEQ = the Banach fixed-point theorem in production. The forward pass is root-finding. There are no layers to backprop through.

KL divergence = a Bregman divergence on the entropy landscape. Your belief space is curved, not flat.

Chain-of-thought reasoning = hidden states flowing along a reasoning field toward an attractor basin. Correct answers have wide basins. The number of reasoning steps is determined by the terrain, not by the problem.

Diffusion models = systems flowing downhill along a score vector field, from noise to structure, from high energy to low energy.

The book traces one idea across 337 years, from F=ma (Newton, 1687) to H=T+V (Hamilton, 1833) to loss landscape + gradient field (2020s). Hamilton replaced a catalog of forces with one geometric object. This book does the same for deep learning.

GitHub: https://github.com/datawhalechina/learning-terrain

Discussion: https://github.com/datawhalechina/learning-terrain/discussions/2

Convergence is not hope. Convergence is geometry. You see.
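The ResNet claim in the post above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not code from the monograph: the residual branch is a hypothetical linear field f(x) = -x (chosen because its exact flow x(t) = x0 * exp(-t) is known), and the names `vector_field`, `residual_block`, and `euler_integrate` are invented for this example. Each "layer" applies x + step * f(x), which is one explicit Euler step, so stacking more layers over the same time span should track the exact flow more closely.

```python
import math


def vector_field(x):
    # Hypothetical residual branch f(x) = -x (an assumption for illustration).
    # Its exact flow is x(t) = x0 * exp(-t), which gives a reference solution.
    return [-xi for xi in x]


def residual_block(x, step=1.0):
    # ResNet-style update: x + step * f(x).
    # With step = 1.0 this is the plain skip connection x + f(x).
    fx = vector_field(x)
    return [xi + step * fi for xi, fi in zip(x, fx)]


def euler_integrate(x0, total_time, num_layers):
    # Stack num_layers residual blocks; each advances time by total_time / num_layers.
    step = total_time / num_layers
    x = list(x0)
    for _ in range(num_layers):
        x = residual_block(x, step)
    return x


x0 = [1.0, -2.0]
exact = [xi * math.exp(-1.0) for xi in x0]

coarse = euler_integrate(x0, total_time=1.0, num_layers=4)
fine = euler_integrate(x0, total_time=1.0, num_layers=64)

# Largest per-coordinate gap between the stacked blocks and the exact flow.
err_coarse = max(abs(a - b) for a, b in zip(coarse, exact))
err_fine = max(abs(a - b) for a, b in zip(fine, exact))

print("4 layers, max error:", err_coarse)
print("64 layers, max error:", err_fine)
```

In this toy setting the 64-layer stack lands closer to the exact flow than the 4-layer one, which is the behavior expected of explicit Euler as the step size shrinks. A trained ResNet learns its residual branch and does not tie its step size to depth this way, so the correspondence is an analogy rather than a statement about any particular architecture.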

replied to OzTianlu's post about 1 hour ago

posted an update about 19 hours ago




Posts 47

Post

Claude Fable 5 was banned (temp or perma?) for security reasons.

Working with Fable, I have to say the model is capable of handling highly complex geometric mathematics, ACTUALLY to the point where I got some work done without a headache. I hope Fable returns soon so I can finish cobbling without going back to a headache and a week per prototype.

During Fable's existence I managed to cobble together a multi-series aleph paradigm that can handle direct implicit and explicit learning for an LM with a trigram context window. This essentially provides expert directional utilization based on a stable codebook, without requiring experts to be distilled into singular experts and duplicated.

Details soon. There are over 20 functional formula prototypes and around 8 potential heads that all lead to the same outcome. The math is rock solid, and each has its own benefits and downsides depending on the assigned text tasks.