Abstract
Research in quadrupedal robotics is transitioning to studies into loco-manipulation, featuring fully articulated robotic arms mounted atop these robots. Integrating such arms enhances the practical utility of legged robots, paving the way for expanded applications like industrial inspection and search and rescue. Existing literature commonly employs a six-degree-of-freedom (six-DoF) arm directly mounted to the robot, which inherently adds significant weight and reduces the available payload for manipulation tasks. Our study explores an optimized combination of arm configuration and control framework by strategically reducing the DoFs and leveraging the quadruped robot’s inherent agile mobility. We demonstrate that by minimizing the DoFs to just one, a range of canonical loco-manipulation tasks can still be accomplished. Some tasks even show improved performance with fewer robotic arm DoFs due to the higher torque motor used in the design, allowing more of the robot’s payload to be used for manipulation. We designed our optimized one-DoF robotic arm and the control framework and tested it on top of a Unitree Aliengo. Our design outperforms conventional six-DoF counterparts in lifting capacity, achieving an impressive 8 kg payload compared to the 2 kg maximum payload of industry-standard six-DoF robotic arms on the same quadruped platform.
1 Introduction
Wheeled mobile manipulators can address many in-door tasks effectively, yet it remains highly difficult to deploy these robots in outdoor environments with challenging terrains. Using legged robots to perform mobile manipulation offers distinct advantages compared to wheeled or tracked counterparts [1–4], as they can traverse diverse and challenging terrains, including climbing stairs or navigating rugged landscapes [5–8].
Legged robots exhibit multiple methods of manipulating objects in their environment, such as using their body or, in the case of humanoid robots, utilizing upper limbs to interact with the surroundings [9–12]. The conventional approach for constructing a legged mobile manipulator is usually inspired by wheeled mobile manipulators and typically involves attaching a high-degree-of-freedom (high-DoF), robot arm on top of a quadruped robot (see Fig. 1). This setup is easy to control due to the redundancy in degrees-of-freedom. However, this comes at a cost: the robot arm requires many motors, resulting in additional weight, while each manipulator motor delivers limited power. This combination thus significantly reduces the payload capacity of the entire robot. In addition, an advantage of legged robots over other types of robots is their ability to effectively control the orientation and position of the body, which also determines the orientation and position of the robot arm’s base, as the arm is attached to the robot body. Therefore, having a fully articulated arm on top of a legged robot is kinematically redundant.
The limitations of current approaches in the literature can be addressed by fully utilizing the robot’s mobile base and avoiding redundancy in degrees-of-freedom between the robot and its arm. For instance, the first joint of a typical articulated arm, which rotates the arm around the vertical axis, can often be replaced by simply yawing the robot’s body. Given the inherent capability of quadruped robots for performing pitching, yawing, and rolling motions of the arm, using a six-DoF arm results in suboptimal resource utilization due to increased controller complexity and additional arm weight, which reduces the available payload.
This paper addresses this challenge by developing a novel platform that combines a simple robotic arm design mounted on a quadruped robot paired with a control framework for dynamic loco-manipulation. The primary objective is to enable the dynamic manipulation of heavy objects without compromising the necessary degrees-of-freedom for performing practical tasks. Our approach optimizes the controller by considering whole-body dynamics and leveraging the reduced degrees-of-freedom in the arm, while maintaining effective loco-manipulation task performance. Compared to locomotion controllers for quadruped robots, our solution incorporates the desired trajectory of the manipulated object, calculating the necessary actions for the robot’s arm to successfully complete the task. The controller has full authority over the robot’s pose, including the position of the center of mass (CoM) and body orientation, while also considering the kinematic constraints imposed by the new arm design.
The key focus is on hardware and control co-optimization to augment capabilities and achieve an arm design that surpasses the efficiency of replicating an industrial robotic arm atop the robot. This led to investigating the required degrees-of-freedom, end effector forces, and speeds to perform specific loco-manipulation tasks. This mapping of manipulator parameters to tasks informed the creation of a final design for a one-DoF arm, demonstrating proficiency in accomplishing most loco-manipulation tasks. A physical prototype was fabricated for validation through real-world experiments.
The main contributions of this paper are as follows:
We conducted an in-depth investigation into manipulation tasks that can be performed by a quadruped robot and analyzed the hardware prerequisites crucial for an efficient robotic platform design.
We developed an efficient quadruped manipulator platform with a minimal-degree-of-freedom arm to cover the essential range of motion (RoM) while maximizing the lifting force and velocity of the end effector.
Developed a novel control framework specifically designed for quadruped-based loco-manipulation, enabling interaction with heavy objects. By incorporating planning and pose optimization, the system can complete tasks that require adjusting the robot’s pose to efficiently utilize all degrees-of-freedom.
Our system, which integrates control and design co-optimization, outperforms conventional approaches in terms of payload capacity. While the maximum payload of a system using an industry-standard six-DoF robotic arm is only 2 kg, our approach enables an 8 kg payload on the same quadruped platform.
2 Related Works
Researchers have increasingly explored mobile manipulation with quadruped robots in recent years. The combination of agile locomotion and manipulation capabilities opens exciting opportunities for legged robots in various real-world applications, from search and rescue missions to industrial automation. One approach to enable manipulation with quadruped robots involves leveraging the robot’s body or existing limbs, albeit confining the manipulation to non-prehensile actions [13–15]. This method typically facilitates interaction with heavy objects that would be challenging for a robotic arm due to limited end effector force and the necessity to maintain the combined center of mass within the legged robot’s support region. Another strategy is to modify the existing legs, creating a hybrid appendage capable of both locomotion and grasping small objects [16,17].
However, the most common method for performing manipulation tasks is to mount a dedicated robotic arm on the legged robot, increasing the system’s complexity with additional joints and weight [18–20]. Since these dedicated appendages are not used for locomotion, they can be specialized for fine sensing, grasping, and manipulation or designed to lift heavier objects. Nonetheless, equipping a legged robot with a dedicated arm presents challenges, such as maintaining stability and ensuring effective cooperation between the arms and legs.
Legged platforms offer advantages over wheeled ones, particularly in arm manipulation capabilities, such as reachability and manipulation speed. This is because legged platforms typically possess more degrees-of-freedom, allowing finer control over the base’s position and orientation. Zimmermann et al. [21] combined Boston Dynamics’ Spotmini with a lightweight robotic arm and gripper, demonstrating dynamic grasping tasks using a feedforward controller.
A legged robot equipped with an arm can dynamically manipulate objects on challenging terrains, including stairs and ladders. For instance, Ferrolho et al. [22] developed a control framework that enables ANYmal-B to perform pick-and-place tasks on rough terrain. Building on this, Ma et al. [23] integrated a learning-based locomotion policy with a model-based manipulation controller, allowing ANYmal-C to maintain a desired end effector position while navigating difficult landscapes.
Moreover, recent research has approached legged platforms as whole-body optimization problems. For instance, ANYmal-B leverages online hierarchical optimization for reactive human–robot collaboration and body posture optimization to enhance arm manipulability [24]. Ewen et al. [25] enhanced their work by transitioning from fixed contact force to online optimized force trajectory, ensuring dynamic feasibility and stability. Sleiman et al. [26] introduced a holistic model predictive controller (MPC) framework integrating dynamic locomotion and manipulation for ANYmal-C. Additionally, Chiu et al. [27] extended the whole-body MPC controller to generate collision-free motions while coordinating locomotion and manipulation. In their recent work, Sleiman et al. [28] incorporated an offline planner into a unified nonlinear loco-manipulation framework, enabling the execution of user-defined tasks by leveraging a library of predefined interactions and environment models.
3 Loco-Manipulation Tasks and Manipulators Design Analysis
In this section, we establish the rationale behind optimizing the DoFs in the robotic arm mounted on the quadruped platform. We undertake a comparative analysis of various designs against the state-of-the-art six-DoF arm, evaluating their effectiveness in accomplishing canonical tasks. Our discussion of loco-manipulation tasks encompasses six key activities that typify the spectrum of interactions expected from a robot manipulator.
3.1 Loco-manipulation Tasks Definition.
The initial task, detailed in Table 1, involves precision manipulation activities commonly encountered in industrial settings, such as turning and closing safety valves. This task requires a wide range of motion for the robotic arm and is often the most demanding. Subsequently, we consider tasks involving lifting and transporting payloads. Current quadruped robots equipped with fully articulated robotic arms face limitations in this area due to the arm’s considerable weight, which reduces the available payload capacity. Furthermore, accommodating multiple joints and precise end effectors imposes constraints on the torque each motor can provide, given the weight limitations. To enable autonomous operation in diverse human-made environments, a robot must interact with doors and traverse them. The last three tasks, though more specialized, hold significance in specific scenarios. These include dynamic tasks like object tossing or activities requiring rapid motions, where joint speed becomes a limiting factor. Payload pushing and pulling, along with tasks involving human–robot interaction, where a highly articulated arm can better adapt to the unpredictable motion of a human, round off our defined loco-manipulation tasks.
Six-DoF arm | Three-DoF arm | One-DoF arm | |
---|---|---|---|
Motor mass | 405 g | 605 g | 605 g |
Joint torque | 33 N m | 33 N m | 33 N m |
Joint speed | |||
Total arm mass | 4.3 kg | 2.5 kg | 1.5 kg |
Precise manipulation | Fully capable | Fully capable | Limited by robot roll RoM |
Heavy payload lifting and transporting | Limited at 2 kg payload | Maximum payload 5 kg | Maximum payload 8 kg |
Door interaction | Fully capable | Fully capable | Limited by robot roll RoM |
Dynamic tasks like object tossing | Limited in end effector speed | Fully capable | Fully capable |
Heavy payload pushing/pulling | Limited in payload | Limited in payload | Fully capable |
Human–robot interaction | Easy to plan | Harder to plan | Harder to plan |
Six-DoF arm | Three-DoF arm | One-DoF arm | |
---|---|---|---|
Motor mass | 405 g | 605 g | 605 g |
Joint torque | 33 N m | 33 N m | 33 N m |
Joint speed | |||
Total arm mass | 4.3 kg | 2.5 kg | 1.5 kg |
Precise manipulation | Fully capable | Fully capable | Limited by robot roll RoM |
Heavy payload lifting and transporting | Limited at 2 kg payload | Maximum payload 5 kg | Maximum payload 8 kg |
Door interaction | Fully capable | Fully capable | Limited by robot roll RoM |
Dynamic tasks like object tossing | Limited in end effector speed | Fully capable | Fully capable |
Heavy payload pushing/pulling | Limited in payload | Limited in payload | Fully capable |
Human–robot interaction | Easy to plan | Harder to plan | Harder to plan |
Note: The three-DoF arm has three joints: a revolute joint at the base of the arm in the same direction of the robot pitch, a prismatic joint along the axis of the first link, and a revolute joint at the gripper in the roll direction of the robot, to compensate for the limited roll of the robot. The one-DoF arm only has the revolute joint at the base of the arm.
3.2 Analysis of Manipulator Design.
Considering these factors, we aim to compare the capabilities of three distinct robotic arm configurations: the fully articulated six-DoF arm, one with three DoFs, and the last with only one DoF. Lowering the degrees-of-freedom offers the advantage of reducing the arm’s weight while potentially increasing the torque/speed of each motor. However, the trade-off lies in the potential limitation of manipulation capabilities with fewer DoFs. Yet, when mounted on a highly mobile legged robot, we intend to leverage its mobility to compensate for the reduction in DoFs and restore lost manipulation capabilities.
Examining Fig. 2, we compare the RoM of a fully articulated arm to that of a one-DoF arm. In the sagittal plane, where the one-DoF arm operates, we can see that its reach is smaller compared to the fully articulated arm in its nominal configuration. However, depending on the manipulation task, one could extend the arm length at the expense of lower payload capability. With this consideration, we optimized the arm length to maximize payload while retaining sufficient reach for objects in front of the robot. The final arm length used for comparison is 0.36 m. A fully articulated arm can effortlessly reach all four points in space around the robot without requiring any movement from the robot. If the arm is mounted on top of the body and can rotate fully around a vertical axis, its workspace can approximate a sphere and fully encompass the robot’s body. In contrast, our one-DoF design can reach the same points, but it requires utilizing the robot’s degrees-of-freedom. This observation indicates that the arm’s additional degrees-of-freedom are redundant when paired with the robot’s mobility, as a similar RoM can be achieved without them. This principle guides our exploration and development of an efficient robotic arm design. The one-DoF arm represents the extreme opposite of the fully articulated arm, with inherent limitations but also significant efficiency and payload advantages.
Moreover, the statically stable workspace of a quadruped robot equipped with a six-DoF arm might constitute a smaller subset of the complete range of motion of the arm. This consideration arises due to the need to account for a significant shift in the center of mass, especially when the arm is heavy or holds a substantial object. In such instances, the control framework may necessitate an additional layer that factors in the configuration of the legs to prevent the robot from tipping over. This limitation in the usable range of motion becomes even more critical when the arm operates on the side of the robot.
The primary challenge with the one-DoF design lies in the roll motion of the end effector. While pitch and yaw motion compensation is feasible through robot poses, the roll motion is confined to deg. This limitation delineates the specific tasks achievable by this design, as outlined in Table 1.
To overcome this limitation, we explored a three-DoF configuration. This design features an additional prismatic joint on the first link to extend its length and improve reach, along with an extra revolute joint at the gripper to enhance the roll range of motion. Although these added degrees-of-freedom reduce the payload capacity, they remove the end effector’s range of motion restrictions present in the one-DoF design.
Comparing the payload capabilities of the three arm designs reveals a significant difference. The fully articulated arm data are derived from the Unitree Z1 arm, specifically designed for the Aliengo, as detailed in Table 1. Weighing 4.3 kg with motors capable of exerting 33 N m torque, its joint speed is limited to . This limitation impacts payload-related tasks, limiting the lifting capacity to 2 kg and dynamic tasks due to the constrained joint speed. In contrast, our one-DoF design, featuring a single Unitree A1 high-torque motor at the base and lightweight construction, boasts a lifting capacity of up to 8 kg and excels in dynamic tasks. The three-DoF design strikes a balance, trading off some payload capacity to achieve superior and more versatile precise manipulation capabilities. When evaluating tasks that require precise manipulation or inherently need more degrees-of-freedom for successful completion, the one-DoF design is generally limited by its restricted roll range of motion. This constraint is especially noticeable in tasks such as door interaction, where the ability to manipulate handles and achieve sufficient roll motion is crucial. However, due to the optimized design, the one-DoF arm can open heavier doors more quickly, effectively balancing the limitations in its range of motion.
We opted to fabricate the one-DoF arm design after evaluating the task capabilities. This decision was driven by its superior compromise between payload capacity and the ability to accomplish tasks. Importantly, it serves as an ideal platform to showcase our optimized controller design, highlighting the effectiveness of our approach in achieving a balance between task versatility and payload capabilities.
4 Design and Fabrication of One Degree-of-Freesom Manipulator
This section outlines the mechanical design of the one-DoF arm utilized in our experiments and its integration into the Unitree Aliengo platform. This quadruped robot has 3 degrees-of-freedom per leg and weighs 21.5 kg. The body measures 0.65 m by 0.31 m, with the leg links measuring 0.25 m each, leading to a standing height of around 0.4 m. The design of the arm is visualized in Fig. 3 2 together with the exploded view highlighting the main components of the arm. The design criteria guiding our selections primarily followed three objectives:
Minimizing total mass: ensuring the arm’s lowest possible total mass to maximize payload capacity, considering the robot’s load limitation of 13 kg.
Single motor operation: enabling the lifting of the maximum payload with a single motor, thereby determining the motor’s torque requirement and the maximum length of the link.
Optimizing reach: attaining a minimum reach for the end effector to execute tasks such as door opening while acknowledging the constrained reach capacity compared to the six-DoF robotic arm.
The single degree-of-freedom is achieved through a revolute joint located at the base of the arm, aligned with the robot’s pitch. This configuration allows the gripper at the end effector to move within the robot’s sagittal plane. The arm’s link length was optimized to enable the manipulation of sufficiently large objects, maintaining an appropriate distance from the robot’s head and facilitating interaction with average-height door handles. Additionally, our control framework allows for pose adjustments to enhance the end effector’s vertical reach.
The joint is powered by a Unitree A1 motor, which provides a maximum torque of 33 N m and a maximum speed of . Simulations showed that this torque capacity is sufficient to lift the Aliengo’s maximum payload, eliminating the need for additional reduction gears, which would have decreased the maximum speed. The motor is securely mounted to the robot, with the first link attached to it. To reduce weight and simplify fabrication, we used two parallel laser-cut aluminum plates fastened together with bolts to accommodate the gripper. The gripper comprises two 3D-printed parts—one fixed and one movable—with the movable part operated by a high-torque servo motor.
The system integration of the robotic arm with the Aliengo platform is depicted in Fig. 4. Communication between the A1 motor and the onboard controller on the robot occurs through a serially connected Teensy 4.1 development board. This board receives motor commands from the high-level controller, which computes and translates them into messages transmitted to the A1 motor via a UART (Universal Asynchronous Receiver/Transmitter) protocol to RS485 communication board. The Teensy microcontroller decodes messages from the motor controller, relaying the current motor states—position, velocity, and torque—to the onboard controller, providing essential feedback. Additionally, the Teensy sends open or close commands to the gripper’s servo motor. Operating at a frequency of 1 kHz, the Teensy ensures seamless communication, matching the high-level controller on the onboard computer. Power for the A1 motor is supplied by a 24 V Li-Po battery positioned atop the robot, which also powers the gripper’s servo motor through a 6 V converter. The result of the fabrication and integration of the arm on the Unitree Aliengo quadruped robot is shown in Fig. 3.
5 Control System Overview
In this section, we introduce the architecture of our control system, depicted in Fig. 4, which forms the foundation of our proposed whole-body coordination framework. Loco-manipulation problems typically involve nonlinear models due to the complex interactions between the robot and the object. However, solving nonlinear optimization problems poses computational challenges and demands substantial onboard processing capabilities for the robot. To avoid this limitation, we decompose loco-manipulation problems into three elementary sub-problems that operate hierarchically. This approach approximates the nonlinear dynamics and allows our controller to handle heavy objects dynamically.
To enhance the dynamic capabilities of the legged manipulator, we lowered the degrees-of-freedom of the arm. This required a novel controller to maximize the capabilities of our manipulator design, capitalizing on the co-optimization we achieved. Notably, due to the robotic arm’s lower degrees-of-freedom, the end effector pose is directly linked to the robot’s pose. This contrasts with a conventional six-DoF arm, where the end effector pose remains independent of the robot’s pose. With our controller, we leverage the entire body’s motion and exploit the enhanced capabilities of the arm, thus overcoming limitations imposed by the arm’s lower DoFs.
In the following subsections, we provide a brief overview of each component. For more technical details, please refer to our previous work utilizing our framework in Ref. [29].
5.1 Planner for Object Manipulation.
The first component of our framework focuses on computing the necessary manipulation actions to achieve predefined manipulation tasks. We employ a linear MPC structure that shares the same control horizon as the final robot controller, ensuring seamless coordination between manipulation and locomotion. Each task specifies object states and manipulation forces, considering various dynamics and commands. Reference values for object states are integrated into the MPC’s cost function to minimize deviations from these references. We apply tailored constraints to ensure task-specific requirements, like maintaining a perpendicular orientation to a door surface while opening it.
The desired manipulation force is utilized in the quadruped loco-manipulation MPC and pose optimization for whole-body coordination. In the former, it is a known quantity representing robot–object interaction, while in the latter, it informs arm joint torque optimization to minimize manipulation effort. The optimized object trajectory guides locomotion in executing the desired manipulation task effectively.
5.2 Pose Optimization for Coordinated Loco-Manipulation.
This section details our pose optimization formulation, which links the object manipulation planner and the whole-body loco-manipulation controller. The primary objective is translating computed manipulation forces and optimal object states into robot-centric states and dynamics. To achieve this, we perform optimized pose computations for the robot at each MPC horizon, considering manipulation force and system kinematic constraints. Our approach is inspired by the work in Ref. [30], with the necessary modifications to adapt it to real-time execution while maintaining meaningful constraints.
Starting with the cost function (Eq. (1)), the objectives include minimizing the difference in CoM height from the reference value, minimizing body rotations, and minimizing arm torques for manipulation, computed using the arm’s contact Jacobian . Appropriate matrices weigh these objectives to tailor the robot’s behavior during manipulation. Constraints (Eqs. (2)–(6)) ensure that hip locations, body orientation, arm joint angles, end effector pose, and manipulation forces align with the optimization goals and manipulation planner outputs.
In particular, we constrain the vertical position of the four hips of the robot to ensure we can perform effective stepping with each leg. Then, we constrain robot orientation to physically feasible values, particularly concerning pitch and roll, where we use limits of deg and deg, respectively. The next constraint enforces limits on arm joint angle to avoid self-collision with the robot body. The last two constraints establish the connections between pose optimization and the MPC manipulation planner. Equation (5) ensures that the end effector’s pose matches the optimized object states computed by the manipulation planner. The specifics of this constraint depend on the particular task at hand; for instance, when lifting an object, it constrains the end effector’s position, whereas for turning a door handle, it also includes the end effector’s orientation. Equation (6) requires that the manipulation force in the optimization variable aligns with the force computed by the manipulation planner. Pose optimization is executed at every MPC horizon, providing reference trajectories for the robot’s states for the whole-body loco-manipulation MPC.
5.3 Whole-Body Loco-Manipulation Model Predictive Controller.
In this section, we explain the role of the whole-body loco-manipulation MPC controller, which efficiently commands the robot along its defined trajectory obtained from pose optimization. This controller is adept at handling the influence of manipulation forces on the robot’s dynamic stability, a critical aspect when dealing with heavy objects. While the baseline locomotion MPC for quadruped robots, as introduced in Ref. [5], is designed to handle minor external disturbances to the robot’s dynamics, it falls short when dealing with heavy objects. These disturbances need explicit consideration in the robot’s dynamics.
In Eq. (7), represents the combined mass of the robot and arm, is the gravity vector, denotes the ground reaction force acting on foot , and is the external manipulation force. In Eq. (8), the derivative of the angular momentum is simplified, retaining only the term , and is the moment of inertia of the robot in the world frame, is the position vector from the robot’s CoM to the foot location, and is the position vector from the robot’s CoM to the gripper location.
In the problem’s cost function, shown in Eq. (13), represents the reference values for the robot’s states, calculated through pose optimization. These reference values are critical in coordinating full-body motion to accurately perform loco-manipulation tasks. Equation (14) outlines the dynamic constraints for each prediction horizon, with and serving as discrete-time counterparts to the matrices described in Eqs. (11) and (12). Additionally, Eq. (15) imposes frictional pyramid constraints for each leg, while Eq. (16) ensures the elimination of reaction forces on swing legs. Equation (17) acts as an equality constraint, aligning the manipulation force in the dynamics with the desired manipulation force from the MPC manipulation planner for each prediction horizon.
The controller optimizes ground reaction forces while considering the manipulation force at every prediction horizon, facilitating precise tracking of desired state trajectories generated by the pose optimization.
6 Results
In this section, we present the results we obtained in both simulation and real hardware to prove the effectiveness of our design. Simulations were performed using the Simscape multibody package in matlab Simulink, a high-fidelity environment that accurately simulates contact-rich scenarios. The results of our study emphasize two key capabilities relevant to canonical manipulation tasks:
High payload handling: Demonstrated through the ability to exert substantial forces at the end effector, our findings showcase the arm’s capability for lifting and carrying heavy payloads.
Precise end effector control: Highlighted through tasks like door opening, this capability underscores the arm’s ability to precisely control the pose and orientation of the end effector, where force application is not a limiting factor.
Our control architecture, described in Sec. 5, operates efficiently on the robot’s onboard computer. The low-level controller runs at 1 kHz, while the loco-manipulation MPC, with a time horizon of and horizons, operates at 30 Hz. We employ CasADi and the Ipopt solver to tackle the NLP pose optimization problem. This configuration ensures that our framework is capable of real-time execution, offering the potential for dynamic and agile loco-manipulation tasks while accounting for the significant effects of high payloads on robot dynamics. We show the effects of the individual elements of the control framework, presented in Sec. 5, in simulation compared to baseline controllers. Then, we show the capabilities of the whole framework with a task consisting of lifting and carrying the maximum payload possible, comparing the three different manipulator designs presented in Sec. 3. After, we show the implementation on the real robot equipped with our design of the one-DoF robotic arm carrying a weight equal to %35 of the robot weight. To conclude, we show an example of a door opening in simulation and with the real robot.
6.1 Simulation.
First, we want to highlight the importance of considering the object in the robot MPC model, especially when the object is heavy. To do this, we compare the loco-manipulation MPC in our approach to a baseline locomotion MPC. We perform this task in simulation and during the task the robot picks up a 3 kg object from the ground using the arm DoF and lifts it to a predefined arm angle in 2 s. We can see snapshots of the task and CoM height and robot pitch tracking with the two controllers in Fig. 5. Using the proposed loco-manipulation MPC, the robot can successfully lift the heavy object from the ground and maintain the desired height and pitch for the robot throughout the entire task. Instead, the baseline MPC cannot follow the desired quantities and struggles to maintain balance. In fact, with objects heavier than 3 kg, the baseline MPC would fail, while our proposed controller can still handle them. Due to the mass-efficient arm design, we have a maximum payload capability of 8 kg, which is almost 40% of the robot weight. Our payload-to-robot weight ratio is higher than other robotic arms used in the state-of-the-art legged loco-manipulation research.
The next component we want to showcase is the pose optimization for coordinated loco-manipulation. This part of the approach is crucial because it links the output of the object manipulation planner to the loco-manipulation MPC. In Fig. 6, we can see various solutions that the pose optimization block can compute to solve the same task based on the choice of weights for the objectives in the cost function. We can consider the pose that tracks CoM height and pitch and only uses the arm to reach the object as the baseline. Then, if we reduce the weight related to the CoM height in the pose optimization cost function, we obtain the pose in Fig. 6(b), where the controller trades the robot CoM height to reach the object with the gripper. Similarly, in Fig. 6(c), we can see the effect of reducing the weight related to robot body rotation. A balanced choice of weights in the cost function gives us a pose that uses all possible DoFs of the system to reach the object with the gripper, as illustrated in Fig. 6(d). All these starting poses can complete the task of lifting the object to a predefined height.
Next, to demonstrate the comprehensive capabilities of the control framework, we subjected the robot to a series of tasks. The initial task involved executing a predefined trajectory, which consisted of moving straight, executing a 90-degree turn on the spot, and resuming forward movement. We tested with increasing payload mass, and the robot was able to lift and carry up to 8 kg. The failure modes after this amount of payload were twofold: the first one was the robot tilting forward when lifting the object from the ground due to the considerable shift in the center of mass and the combined center of mass going out of the support region of the legs; the second one was a combination of the torque limit on the arm motor and the torque limit reached by the knee motor of the legs when trotting and having a stance on only two legs. We also tested the same task with the six-DoF arm with the data taken from the Unitree Z1 arm; in this case, the maximum payload was, as reported by the manufacturer, around 2 kg. The failure mode was due to the torque limit reached on the arm motors when lifting the payload from the ground. In this case, the knee joints on the robot’s legs were still below the torque limit, meaning that the robot could transport more load but was limited by the arm hardware capabilities.
The subsequent three tasks, with snapshots depicted in Fig. 7, were devised to show how the co-optimization of hardware and controller enables us to overcome the inherent limitation of having only one degree-of-freedom for the arm.
In the first task, as illustrated in Fig. 7(a), we aim to demonstrate the robot’s ability to adjust its yaw for effective manipulation while still grasping the payload. Positioned in a narrow corridor where it cannot face the box directly, the robot relies on the coordinated operation of its three subsystems. The controller dynamically adjusts the yaw to reach the box handle without colliding with the corridor walls. Once the object is grasped, the robot uses the arm to lift the box, resets the yaw to zero, and moves forward.
Conversely, in the second task depicted in Fig. 7(b), we aim to showcase the robot’s ability to perform precise manipulation through pose adjustment with roll. The controller coordinates a roll maneuver to manipulate the valve, considering the constraints imposed by having only one degree-of-freedom. By iteratively executing this movement, the robot successfully rotates the valve despite having a single degree-of-freedom, a feat comparable to a six-DoF arm that would achieve the task without changing the gripping point.
The final task we tested involved opening a heavy, resistant hinged door. Figure 7(c) shows snapshots of this task successfully completed in simulation. Within the manipulation planner, we defined subtasks such as turning the door handle and pushing the door open. The pose optimization coordinates the robot’s motion to align with these manipulation actions. Importantly, the optimization process accounts for potential collisions with the door frame, allowing for the generation of feasible, collision-free trajectories for the robot’s CoM. One key advantage of our approach, with its distinct hierarchical components working together, is the ability to incorporate additional constraints into the problem without significantly increasing complexity and computational overhead.
6.2 Hardware Experiments.
Similar to our simulation results, we aimed to demonstrate the difference in lifting capabilities by employing the fabricated one-DoF arm mounted on the Unitree Aliengo, utilizing our proposed controller. In Fig. 8, we illustrate how the difference observed in the real robot between a baseline MPC and our control framework is further accentuated. Here, the baseline MPC encounters failure with a payload exceeding 3 kg, as evident from the snapshots where the robot struggles to maintain balance. Conversely, we effortlessly lift and transport 3 kg with the proposed controller, reaching the robot’s limit of approximately 7 kg. These findings corroborate the outcomes observed in the simulation.
We opened a resistive hinged door with the real robot, and the motion snapshots can be seen in Fig. 9. In this case, we used the loco-manipulation MPC with manual commands from a joystick to coordinate the movement and complete the task. The external force coming from the manipulation is considered a known quantity and fixed in the controller.
7 Conclusions
This paper presents a practical approach to deal with the challenges of loco-manipulation in legged robots. By co-optimizing the arm’s degree-of-freedom and the control framework, we can take advantage of the high mobility of legged robots and design a novel robotic arm that maximizes the available payload for manipulation. With the simultaneous mechanical design and controller optimization approach, we can perform dynamic tasks where the object significantly impacts the robot’s stability. We have broken the problem into three components in a hierarchical approach that works together effectively. Our approach has been validated through both numerical simulations and experimental scenarios. We demonstrate how our whole control framework overcomes the limitations of having only one degree-of-freedom in the arm. We have applied our control framework to a Unitree Aliengo robot equipped with the optimized one-DoF robotic arm, and it has accomplished tasks such as lifting and carrying an 8 kg object and opening a resistive hinged door. Future directions of investigation arising from this project are twofold: focusing on enhancements related to the arm and its utilization and improvements related to the controller. For the former direction, we need to include better localization and environment sensing to replicate complex tasks like the ones shown in the simulation. Integrating vision or a Lidar system would allow us to interact with the environment in an automated way. Furthermore, this framework opens avenues for introducing teleoperation to govern the end effector position of the arm. On the controller front, ongoing efforts center on developing a novel unified formulation integrating the arm angle and manipulation forces directly into the model predictive controller, while retaining a linear dynamic formulation. This addition extends the system’s capabilities to tackle even more complex tasks.
Footnote
Conflict of Interest
There are no conflicts of interest.
Data Availability Statement
The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.