RUFFY-369/README.md

Kon'nichiwa! 👋

I’m an ML Research Engineer focused on distributed training infrastructure, multi-agent orchestration, and production-grade inference.

By day, I architect high-throughput AI compute pipelines and optimize complex models across decentralized networks of heterogeneous GPUs. On the open-source side, I'm a core contributor building stateful distributed backplanes and massive-scale RL infrastructure.

My work bridges research and low-level systems, whether that means scaling distributed RL (GRPO/PPO), eliminating I/O bottlenecks in training clusters, or compiling multimodal backbones into TensorRT for continuous inference.

When I’m not scaling clusters or compiling engines, you’ll find me listening to Shoegaze, tracking astronomy, playing badminton, or leaning into my Otaku side.

Thanks for reading!

  • You can check out parts of my work in the repos.

  • If you are interested in collaborating on cool things like reinforcement learning or robotics, feel free to hit me up via email.

Pinned

  1. SAC_implementation (Public)

    Implementation of the Soft Actor-Critic (SAC) paper

    Jupyter Notebook 2

  2. DDPG_implementation (Public)

    An implementation of the paper 'Continuous Control with Deep Reinforcement Learning' (DDPG)

    Jupyter Notebook 1

  3. hermes-agent (Public)

    Forked from NousResearch/hermes-agent

    The agent that grows with you

    Python 3

  4. atropos (Public)

    Forked from NousResearch/atropos

    Atropos is a Language Model Reinforcement Learning Environments framework for collecting and evaluating LLM trajectories through diverse environments

    Python

  5. torchtitan (Public)

    Forked from NousResearch/torchtitan

    A PyTorch native library for large model training

    Python

  6. ros-hermes (Public)

    ROS2 Control. Hermes Evolving Intelligence. Synchronized. ִֶָ.💖 ࣪ ִֶָ🪽་༘

    Python 2