In general, BFT is the ability of a system to reach the consensus it needs despite the presence of potentially faulty inputs.
The theories behind BFT stem from research into how critical systems could maintain reliability in the presence of potential faults. A classic example is the development of avionics, in which increasingly complex systems became vulnerable to faulty components, or even cosmic rays, that could lead to catastrophic failures. In such cases, the typical approach to BFT was to use multiple sensors and voting systems (such as 2 out of 3) to try to minimise the probability of failure. Virtually all fault-tolerant systems assume that a strict majority, or supermajority, of nodes in the system are both honest and reliable.
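The 2-out-of-3 voting idea can be illustrated with a minimal Python sketch. The function name `majority_vote`, the `quorum` parameter, and the sample readings are hypothetical and chosen only for illustration. The sketch also assumes readings can be compared for exact equality, whereas a real avionics voter would typically compare values within a tolerance.

```python
from collections import Counter


def majority_vote(readings, quorum):
    """Return the value reported by at least `quorum` sources, else None.

    Illustrative only: assumes readings are exactly comparable
    (for example, already rounded or discretised).
    """
    if not readings:
        return None
    # Find the most frequently reported value and how often it appears.
    value, count = Counter(readings).most_common(1)[0]
    return value if count >= quorum else None


# Three redundant sensors, one of which is faulty.
agreed = majority_vote([101, 101, 250], quorum=2)

# No two sensors agree, so the voter cannot reach a decision.
undecided = majority_vote([101, 180, 250], quorum=2)
```

With one faulty sensor out of three, the two honest readings still outvote it. Once two sensors fail in different ways, no value reaches the quorum, which is why such schemes depend on a majority of components being reliable.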
BFT is more challenging in trustless distributed networks because of the presence of potentially malicious actors. One of the key innovations of Bitcoin was its approach to solving the Byzantine Generals Problem by using the economic incentives of Proof of Work (PoW) to encourage actors to maintain the network.
In theory, Proof of Stake (PoS) systems offer even stronger BFT than PoW systems, since they use both incentives and punishments to deter malicious activity by actors.
There are other approaches to the Byzantine Generals Problem, such as Algorand, which attempts to provide BFT by randomly changing actors' roles each round to prevent collusion.
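The general idea of re-drawing roles each round can be sketched as follows. This is not Algorand's actual algorithm, which relies on cryptographic sortition with verifiable random functions. It is a simplified stand-in that derives a per-round seed from a hash. The names `select_committee`, `shared_seed`, and the node labels are hypothetical.

```python
import hashlib
import random


def select_committee(nodes, round_number, shared_seed, size):
    """Pick `size` nodes for a round from a seed every participant can recompute.

    Simplified illustration: a real protocol would need a seed that
    no participant can predict or bias in advance.
    """
    # Derive a round-specific seed so each round gets a fresh draw.
    digest = hashlib.sha256(f"{shared_seed}:{round_number}".encode()).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # Sort first so every participant samples from the same ordering.
    return rng.sample(sorted(nodes), size)


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e", "node-f"]
committee_round_1 = select_committee(nodes, 1, "example-seed", size=3)
committee_round_2 = select_committee(nodes, 2, "example-seed", size=3)
```

Because the draw depends on the round number, a group of actors cannot count on holding the same roles from one round to the next, which is the property that makes sustained collusion harder to arrange.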