This paper considers the problem of communication- and computation-efficient distributed learning over a wireless fading Multiple Access Channel (MAC). The learning task is performed over a large network of nodes, each holding local data, with the help of an edge server that coordinates between the nodes. The information from each distributed node is transmitted as an analog signal through a noisy fading wireless MAC, using a common shaping waveform. The edge server receives a superposition of the analog signals, computes a new parameter estimate, and communicates it back to the nodes; this process continues until an appropriate convergence criterion is met. Unlike typical federated learning approaches based on communicating local gradients and averaging them at the edge server, in this paper we investigate a scenario where the local nodes implement a second-order optimization technique known as Determinantal Averaging. The per-node communication complexity of this method at each iteration is the same as that of any gradient-based method, i.e., O(d), where d is the number of parameters. To reduce the computational load at each node, we also employ an approximate Newton method to compute the local Hessians. Under the usual assumptions of convexity and double differentiability of the local objective functions, we propose an algorithm titled Distributed Approximate Newton with Determinantal Averaging (DANDA). State-of-the-art first- and second-order distributed optimization algorithms are numerically compared with DANDA on a standard dataset with least-squares local objective functions (linear regression). Simulation results illustrate that DANDA not only converges faster than gradient-based methods, but also compares favourably with exact distributed Newton methods such as LocalNewton.
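To make the O(d) communication claim concrete, the following is a minimal, noiseless sketch of a determinantal-averaging Newton iteration for the linear-regression setting described above. It is not the paper's exact DANDA algorithm: the fading MAC, analog superposition, and the approximate-Newton Hessian sketch are omitted, and the function names, ridge term `lam`, and toy data are illustrative assumptions. Each node transmits only a d-dimensional Newton direction plus a scalar determinant weight, and the server combines them by determinantal averaging.

```python
import numpy as np

def local_message(w, X, y, lam=1e-3):
    """One node's contribution: (log det H_i, H_i^{-1} g_i)."""
    n, d = X.shape
    g = X.T @ (X @ w - y) / n + lam * w     # local gradient (ridge-regularized least squares)
    H = X.T @ X / n + lam * np.eye(d)       # local Hessian (an approximate Newton sketch could replace this)
    _, logdet = np.linalg.slogdet(H)        # log-determinant, kept as a scalar for numerical stability
    direction = np.linalg.solve(H, g)       # H_i^{-1} g_i: a d-vector, so O(d) communication per node
    return logdet, direction

def determinantal_average(messages):
    """Edge-server aggregation: local Newton directions weighted by det(H_i)."""
    logdets = np.array([m[0] for m in messages])
    dirs = np.stack([m[1] for m in messages])
    wts = np.exp(logdets - logdets.max())   # weights proportional to det(H_i), rescaled to avoid overflow
    return (wts / wts.sum()) @ dirs         # aggregated Newton direction

# Toy synchronous run with m nodes and linear-regression local objectives.
rng = np.random.default_rng(0)
d, m, n_local = 5, 8, 200
w_true = rng.normal(size=d)
data = []
for _ in range(m):
    X = rng.normal(size=(n_local, d))
    data.append((X, X @ w_true + 0.1 * rng.normal(size=n_local)))

w = np.zeros(d)
for _ in range(10):
    msgs = [local_message(w, X, y) for X, y in data]
    w = w - determinantal_average(msgs)     # Newton-style update computed at the server
print("parameter error:", np.linalg.norm(w - w_true))
```

In the over-the-air setting of the paper, the weighted sum performed by `determinantal_average` would instead be realized (up to fading and channel noise) by the superposition of the nodes' analog transmissions at the edge server.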