Open to Collab
19
6
24
AbstractPhila
PRO
AbstractPhil
85 followers · 124 following
https://civitai.com/user/AbstractPhila
AbstractEyes
AI & ML interests
datasets, research papers, experimentation, vision, classification, text encoders, tokenization, llms, diffusion, distillation, and more.
Recent Activity
updated a model about 2 hours ago: AbstractPhil/geolip-aleph-void
reacted to OzTianlu's post with 🧠 about 17 hours ago
ResNet is Explicit Euler. GPT is Implicit Euler. What Else is Hiding in Plain Sight?

Read online: https://datawhalechina.github.io/learning-terrain/

I wrote an open-source monograph on learning dynamics: The Terrain of Learning. Bilingual (Chinese/English), 4 volumes, 12 chapters, 30+ print-grade figures. Completely free (CC BY-NC-SA 4.0).

The core argument: gradient descent is not optimization. It's terrain motion. The loss function is a landscape. The gradient is the direction of slope. The optimizer is how you choose each step. Once you see it this way, everything clicks:

ResNet = explicit Euler integration on a vector field. The residual branch is the vector field. Each layer takes one Euler step.

GPT autoregression = implicit-state Euler iteration. Stable where explicit Euler explodes. That's why transformers handle long-range dependencies.

DEQ = the Banach fixed-point theorem in production. The forward pass is root-finding. There are no layers to backprop through.

KL divergence = a Bregman divergence on the entropy landscape. Your belief space is curved, not flat.

Chain-of-thought reasoning = hidden states flowing along a reasoning field toward an attractor basin. Correct answers have wide basins. The number of reasoning steps is determined by the terrain, not by the problem.

Diffusion models = systems flowing downhill along a score vector field, from noise to structure, from high energy to low energy.

The book traces one idea across 337 years, from F=ma (Newton, 1687) to H=T+V (Hamilton, 1833) to loss landscape + gradient field (2020s). Hamilton replaced a catalog of forces with one geometric object. This book does the same for deep learning.

GitHub: https://github.com/datawhalechina/learning-terrain
Discussion: https://github.com/datawhalechina/learning-terrain/discussions/2

Convergence is not hope. Convergence is geometry. You see.
replied to OzTianlu's post (the same post, "ResNet is Explicit Euler. GPT is Implicit Euler. What Else is Hiding in Plain Sight?") about 17 hours ago
Organizations
AbstractPhil's datasets (71)
Sort: Recently updated
AbstractPhil/diffusion-pretrain-set-ft1-1024 • Viewer • Updated 4 days ago • 1.14M • 653
AbstractPhil/sdxl-qwen-phase1-cache • Viewer • Updated 8 days ago • 86k • 706
AbstractPhil/geolip-sdxl-fid-scoring • Viewer • Updated 9 days ago • 2.8k • 52
AbstractPhil/sdxl-qwen-phase0 • Viewer • Updated 10 days ago • 86k • 1.1k • 3
AbstractPhil/diffusion-pretrain-set-ft1 • Viewer • Updated 18 days ago • 1.29M • 4.99k • 1
AbstractPhil/IMDB-PUBLIC-SCRAPED • Preview • Updated 26 days ago • 55 • 1
AbstractPhil/ldhnam-deepfashion_controlnet • Viewer • Updated 26 days ago • 26k • 35
AbstractPhil/ffhq_flux_latents_repaired • Viewer • Updated 26 days ago • 40.8k • 596
AbstractPhil/synthetic-characters • Viewer • Updated 26 days ago • 149k • 1.98k
AbstractPhil/CN_pose3D_V10_512 • Viewer • Updated 26 days ago • 66.5k • 344
AbstractPhil/CN_pose3D_V7_512 • Viewer • Updated 27 days ago • 255k • 732
AbstractPhil/synthetic-object-relations-json • Viewer • Updated 27 days ago • 5k • 156
AbstractPhil/cc-task1-json • Preview • Updated 27 days ago • 1.07k
AbstractPhil/cc-prompts-sharded • Updated about 1 month ago • 196
AbstractPhil/json-coco-format • Viewer • Updated about 1 month ago • 129k • 328
AbstractPhil/svae-freckles-4096-cifar10 • Viewer • Updated Apr 10 • 60k • 55
AbstractPhil/ryan-spearman-prepared-features • Viewer • Updated Mar 27 • 1 • 412
AbstractPhil/conceptual-captions-12m-webdataset-berts • Viewer • Updated Mar 20 • 32.3M • 4.45k • 1
AbstractPhil/bertenstein-v1 • Viewer • Updated Mar 7 • 37.4k • 2.51k
AbstractPhil/residual-thinking-embeddings • Updated Mar 3 • 155
AbstractPhil/synthetic-object-relations • Viewer • Updated Jan 28 • 100k • 1.2k • 1
AbstractPhil/imagenet-synthetic • Viewer • Updated Jan 24 • 30k • 2.4k
AbstractPhil/ffhq_with_llava_shorter_captions_flux_latents • Viewer • Updated Jan 23 • 40.8k • 208
AbstractPhil/flux-schnell-teacher-latents • Viewer • Updated Jan 22 • 121k • 145
AbstractPhil/bulk-coco-features • Viewer • Updated Dec 28, 2025 • 4.19M • 487
AbstractPhil/dataset-code-test • Viewer • Updated Dec 28, 2025 • 120 • 47
AbstractPhil/foldl-midi • Preview • Updated Nov 8, 2025 • 13
AbstractPhil/sd15-latent-distillation-500k • Viewer • Updated Nov 7, 2025 • 734k • 108
AbstractPhil/bulk-sd15-feature-extract • Viewer • Updated Oct 26, 2025 • 100 • 16
AbstractPhil/imagenet-clip-features • Viewer • Updated Oct 2, 2025 • 7.16M • 767 • 1