A highly variable clinical course, immune dysfunction, and a complex genetic landscape complicate treatment decisions and the management of infection risk in patients with chronic lymphocytic leukemia (CLL). In recent years, machine learning (ML) technologies have made it possible to begin untangling such heterogeneous disease entities. In this study, we used three classes of variables (CLL-IPI variables, baseline (para)clinical data, and data on recurrent gene mutations) to build ML predictive models of the individual risk of four clinical outcomes: Death, Treatment, Infection, and the combined outcome of Treatment or Infection. Using these models, we assessed to what extent each class of variables is predictive of the four outcomes within both a short-term 2-year outlook and a long-term 5-year outlook after CLL diagnosis. Adding the baseline (para)clinical data to the CLL-IPI variables improved predictive performance, whereas including the data on recurrent gene mutations yielded no further improvement. We discovered two main clusters of variables that are predictive of Treatment and Infection. Further emphasizing the high mortality due to infections in CLL, we found a close similarity between the variables predictive of Infection in the short-term outlook and those predictive of Death in the long-term outlook. We conclude that, at the time of CLL diagnosis, routine (para)clinical data are more predictive of patient outcome than recurrent mutations. Future studies modeling genetics and clinical outcome should always consider including (para)clinical data to improve performance.
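
The nested comparison of variable classes described above can be illustrated with a minimal sketch. It is not the study's implementation: the variable names are hypothetical placeholders, and the use of ROC AUC as the performance measure is an assumption, since the metric is not named here. The sketch builds cumulative feature sets (CLL-IPI alone, then with (para)clinical data, then with mutations) and provides a plain rank-based AUC that could be used to score a model trained on each set.

```python
from itertools import accumulate

# Hypothetical variable names, for illustration only; the study's actual
# variables are not listed in this text.
VARIABLE_CLASSES = {
    "cll_ipi": ["age", "stage", "b2m", "ighv", "tp53"],
    "clinical": ["hemoglobin", "lymphocyte_count"],
    "mutations": ["notch1", "sf3b1"],
}


def nested_feature_sets(classes):
    """Return cumulative feature sets: first class, first + second, and so on."""
    names = list(classes)
    cumulative = accumulate((classes[n] for n in names), lambda a, b: a + b)
    return {"+".join(names[: i + 1]): feats for i, feats in enumerate(cumulative)}


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Assumed metric; the source only refers to 'predictive performance'.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("labels must contain both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


feature_sets = nested_feature_sets(VARIABLE_CLASSES)
for name, features in feature_sets.items():
    print(name, len(features))
```

In use, one model per outcome and outlook would be trained on each entry of `feature_sets`, and the resulting held-out risk scores passed to `roc_auc`; a gain from one set to the next would suggest that the added class carries predictive information beyond the classes already included.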