Deep learning has demonstrated considerable success in embedding images and other 2D representations into compact feature spaces for downstream tasks such as recognition, registration, and generation. Learning from 3D data, however, remains the missing piece needed for embodied agents to perceive their surrounding environments. To bridge the gap between 3D perception and robotic intelligence, my current efforts focus on learning 3D representations with minimal supervision. In this talk, I will share my perspectives and experiences on building 3D representations for autonomous driving and robotics. First, I will cover my PhD work, which mainly targets learning 3D representations from point clouds. Then, I will discuss our recent efforts using neural fields as representations for robotics and autonomous driving, covering both the benefits of neural fields and the barriers to their adoption. Finally, the talk will conclude with a discussion of future directions for designing complete and active 3D learning systems.