A 6.7-MFLOPS floating-point coprocessor with vector/matrix instructions | IEEE Journals & Magazine | IEEE Xplore