The heat capacity \mathcal{C} of a given probe is a fundamental quantity that determines, among other properties, the maximum precision in temperature estimation. In turn, \mathcal{C} is limited by a quadratic scaling with the number of constituents of the probe, which provides a fundamental limit in quantum thermometry. Achieving this fundamental bound with realistic probes, i.e. experimentally amenable, remains an open problem. In this work, we exploit machine-learning techniques to discover optimal spin-network thermal probes, restricting ourselves to two-body interactions. This leads to simple architectures, which we show analytically to approximate the theoretical maximal value of \mathcal{C} and maintain the optimal scaling for short- and long-range interactions. Our models can be encoded in currently available quantum annealers, and find application in other tasks requiring Hamiltonian engineering, ranging from quantum heat engines to adiabatic Grover's search.