An older text that may be hard to find is Hopcroft and Ullman s "Introduction to Automata Theory, Languages, and Computation". There are a number of editions --- I have heard that 79 is the best, in as much as it pulls the fewest punches in introducing complex stuff. It a legitimate, albeit small textbook, and it introduces the whole field, not just what you re looking for. I suggest this on the likely vain hope that maybe one of those "trickier" proofs other sources leave out may be your key.
As a gentler starting point, there are a few handy "benchmark" languages.
- If your model can recognize the language of all strings where there are the same number of As and Bs in a string, then it is at least more powerful than FSM.
- If it can t, then it may be equivalent to FSM.
- Similarly, if your model can recognize the language of all strings where the are the same number of As, Bs, and Cs in a string, then it is more powerful than a CFG, or PDA.
These "benchmark languages" are really just functions in disguise --- the first is basically asking if 2 unary numbers are equal, the second is asking if 3 unary numbers are equal. They are nice and simple, and are well-known to be above or below the capabilities of particular models. I don t know simple ones for the more complex machines, so you may be on your own.
Note that for the model "LBA", linearly bounded automata, I believe that there is no known natural language that is computable with a TM, but NOT an LBA. This statement is drawn from hazy memories, so don t take it as a formal proof. :)
Note (lastly) that the "benchmark" languages ARE NOT ESTABLISHING EQUALITY. That is to say, if your model cannot compare 2 unary numbers, that does not mean it is necessarily equivalent to a FSM, it could be even weaker. The languages would only establish what it is greater than in power, or less than in power.
On a completely (completely) different track, Sipser s book does do proofs of equivalence between regexes and FSM, as well as between PDAs and CFGs. I m not sure how helpful that will be, as you are quite vague on what sort of model of computation you re considering, but if you re dead-set on equivalence, those may be good starting points.