Autonomous bicycle control is a challenging problem in underactuated nonlinear systems, requiring precise coordination of lateral balance and steering under dynamic conditions. To address the parameter sensitivity and computational complexity of traditional model-based approaches, this paper presents a comprehensive deep reinforcement learning framework for robust lateral neural control of autonomous bicycles. Our approach eliminates the need for explicit dynamic modeling by training end-to-end neural policies with Proximal Policy Optimization (PPO) in high-fidelity NVIDIA Isaac Sim environments. The framework incorporates a carefully designed multi-objective reward function that simultaneously optimizes balance maintenance, velocity tracking, and steering precision. Systematic domain randomization strategies bridge the simulation-to-reality gap, enabling successful transfer to hardware with robust performance across diverse operational conditions. These results establish deep reinforcement learning as a viable paradigm for practical autonomous bicycle control, offering superior adaptability and robustness compared to traditional approaches. Video demonstrations of the proposed system are available at https://anony6f05.github.io/CycleRL/.
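To make the multi-objective reward concrete, the following is a minimal sketch of one common way such a reward could be composed: a weighted sum of quadratic penalties on roll angle (balance), velocity-tracking error, and steering error. The weights, targets, and quadratic form are illustrative assumptions, not the paper's published reward.

```python
# Hypothetical weights -- the actual values and reward shape used in the
# paper are not specified in the abstract.
W_BALANCE, W_VELOCITY, W_STEERING = 1.0, 0.5, 0.3

def lateral_control_reward(roll_angle, velocity, steer_angle,
                           target_velocity=2.0, target_steer=0.0):
    """Multi-objective reward combining balance maintenance, velocity
    tracking, and steering precision as negative quadratic penalties
    (illustrative form only)."""
    balance_term = -W_BALANCE * roll_angle ** 2                        # keep the bike upright
    velocity_term = -W_VELOCITY * (velocity - target_velocity) ** 2    # hold target speed
    steering_term = -W_STEERING * (steer_angle - target_steer) ** 2    # track steering command
    return balance_term + velocity_term + steering_term

# An upright bicycle at the target velocity and steer angle incurs no penalty,
# and the reward decreases as any objective drifts from its target.
```

In practice such a scalarized reward lets a single PPO policy trade off the three objectives through the weight choices, which is one standard design for balancing competing goals in end-to-end control.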