It's not possible to subtract registers within the address computation; the instruction encoding simply doesn't support it.
The most you can do within the address is [register1 + N*register2 + constant], where N can only be 1, 2, 4, or 8 (or effectively 0 if the second register isn't there at all). It doesn't matter whether the code is 64-bit or 32-bit; this hasn't really changed.
Sure, you can do [rbp-88], but only because that's actually [rbp + (-88)]: the constant is a signed value encoded in the instruction, so it can be negative.
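To make the allowed form concrete, here is a rough C model of the computation described above. It is only a sketch: the function name effective_address and its parameters are made up for illustration, scale 0 is my own convention for "no index register", and the 32-bit signed constant is an assumption that matches the usual displacement size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of [base + scale*index + disp].
 * Returns false for a scale the encoding cannot express. */
static bool effective_address(uint64_t base, uint64_t index, unsigned scale,
                              int32_t disp, uint64_t *out)
{
    /* scale == 0 stands for "no index register" in this sketch */
    if (scale != 0 && scale != 1 && scale != 2 && scale != 4 && scale != 8)
        return false;

    /* Everything is added; the displacement is sign-extended first,
     * which is the only way a "subtraction" can appear. */
    *out = base + (uint64_t)scale * index + (uint64_t)(int64_t)disp;
    return true;
}
```

With base 1000, no index and a constant of -88, this yields 912, which is the [rbp-88] case. There is no parameter through which a register could be subtracted.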
If your algorithm allows it (it's often possible in loops with a counter variable), you can negate the register value (in your example, with "neg rcx") somewhere at the beginning, and then work with the negative value. You can then use "mov rdi, [rax+rcx]" to access the memory, but of course you need to reverse the other operations on that register: instead of "inc rcx" you use "dec rcx", "add rcx, N" becomes "sub rcx, N", and so on.
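The same idea, sketched in C rather than assembly. The function name sum_backwards and the variable neg are made up for illustration; neg plays the role of rcx after the negation. Note that C scales the index by the element size for you, so last[neg] here corresponds roughly to [rax + 4*rcx], not to the unscaled [rax+rcx].

```c
#include <stddef.h>
#include <stdint.h>

/* Sum n elements walking backwards from `last`: last[0], last[-1], ...
 * The index only ever holds a non-positive value, so each access is
 * "base + index" and never "base - index". */
static int64_t sum_backwards(const int32_t *last, size_t n)
{
    int64_t total = 0;

    /* neg counts 0, -1, -2, ...: "dec" where a positive counter would "inc" */
    for (ptrdiff_t neg = 0; neg > -(ptrdiff_t)n; neg--)
        total += last[neg];   /* address is last + neg */

    return total;
}
```

Called with a pointer to the last element of {1, 2, 3, 4, 5} and n = 2, it adds 5 and 4.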