(Quick T.J. and Brown H. 2018)
Taken from Fig. 14 of Aids to the Examination of the Peripheral Nervous System, Medical Research Council Memorandum No. 45 (superseding War Memorandum No. 7), Her Majesty’s Stationery Office, 1976.
The following is an introduction to the discipline of clinical assessment of motor recovery pertinent to the measurement of outcomes following re-innervation.
The clinical experience of nerve injury involves many aspects that are subjective in nature and not easily graded objectively. One example is the experience of pain. It is a very personal experience and can, in truth, never be assessed by another. “Is it possible, in the final analysis, for one human being to achieve perfect understanding of another?” (Murakami 2011). But one can, as an observer, recognise signs that a fellow human is in pain (Cowen et al. 2015), and indeed this empathetic assessment is frequently used to grade pain in young children (Slater et al. 2008) and in others who cannot comprehend or communicate their suffering (Herr et al. 2006).
The objective assessment of movement seems, at first glance, a much more straightforward task. Movement is easily seen by an observer and can be graded. A simplistic assessment would presume that the observation of a movement is not so different from the subjective experience of that movement. Many movements are straightforward to initiate and require no conscious effort; they occur almost without being willed. Yet even the assessment of observed or measured motor function has encountered challenges.
The recovery of motor function following denervation (occurring naturally or after surgical intervention) is slow: the progression from a long period of flaccid paralysis, through the first flickers of volitional contraction, to a plateau of functional gain takes months. In objectively assessing this recovering muscle function, force has become the ‘headline figure’. Other aspects of muscle function, such as sustainability and fatigue, control and proprioception, and the ability to grade the increase or release of force, have been overlooked or ignored. Beyond this, even the assessment of muscle force has become simplified. Force has been equated solely with maximal voluntary contraction (MVC). Manual muscle testing (MMT) of MVC has been established, by consensus over generations of clinicians, as the uni-modal assessment of choice for this important characteristic of neurologic disease.
The importance of assessing muscular wasting, paralysis and weakness from nerve injury has been recognised since early history, having been recorded over 3500 years ago in Genesis 32 (Hoenig 1997). A good review of this history is given by Dyck (2005), in a historical essay tracing the ‘history of scoring and assessment of neuromuscular weakness as part of daily neurological practice’.
The first modern record of a scale to assess weakness from neurologic dysfunction was published by Mitchell and Lewis (1886) in the United States. This collaboration between a celebrated American Civil War surgeon (Silas Weir Mitchell) and a neurologist (Morris J. Lewis) set the tone for the driving power of military experience and multidisciplinary collaboration, which continues to the modern day. In their report (Mitchell & Lewis 1886) on 23 patients with ‘posterior sclerosis’ of the spinal cord, they were not only the first to describe how to elicit a tendon reflex but also the first to describe an alphanumeric scoring system, scoring the ataxia from these upper motor neurone lesions as:
Class 1: normal
Class 2: slight impairment
Class 3: great impairment
Class 4: paralysis
Muscle reflexes (an assessment of the muscle control arc) were further scored as 0 (absent), = (very slight), – (slight), N (normal), + (marked) and + + (very marked).
The next recognised step forward came from the Mayo Clinic in Rochester where, according to research by Dyck (2005), the work of Henry Plummer from 1910 extended the use of an ordinal numerical scale, and of + and –, to muscle weakness. The scale (still used in many American institutions) begins with 0 (normal) and runs from -1 (weak) to -4 (absent). This scale did not appear in publication until 1956 (Bastron 1956).
Wilhelmine Wright, writing in the Boston Medical and Surgical Journal in 1912 (Wright 1912) about her experiences of polio in Boston and in Berlin, describes her ‘rough method’ of classifying the muscles according to the amount of resistance they can overcome:
- Muscle capable of overcoming gravity and outside force—normal.
- Muscle capable of overcoming gravity alone—good.
- Muscle capable of overcoming friction of joint and table—fair.
- Muscle capable of overcoming friction only when assisted—poor.
- Muscle incapable of any contraction—bad.
Robert Lovett, a Professor of Orthopaedic Surgery in Boston, Massachusetts, USA, published his rather simplistic non-numerical rank scale in 1916 (Lovett 1916):
Fair (able to move against gravity) and
Good (able to move against resistance).
In 1939 Kendall and Kendall (H. O. Kendall et al. 1971), when assessing motor loss and recovery in cases from the polio epidemic, empirically graded a manual assessment, equating fair with 50% strength and good with 80% strength.
Manual muscle testing was advanced in 1941 by the Medical Research Council committee chaired by Brigadier Riddoch, in the pamphlet “Aids to the investigation of peripheral nerve injuries (War Memorandum No. 7)”, HMSO, London (subsequently revised in 1943). The MRC scale was established as a post-war tool for manual muscle testing (MMT) to grade recovery from nerve injuries, whereas previous scales had addressed deteriorating medical neurology.
Thus, reflecting improvement from paralysis, the scale starts at 0 for no function and progresses upwards to 5 for peak power:
0 No contraction
1 Flicker or trace of contraction
2 Active movement with gravity eliminated
3 Active movement against gravity
4 Active movement against gravity and resistance
5 Normal power.
The popularity and worldwide recognition of this publication are most probably due to its simplicity and its educational illustrations of how limb muscles should be tested (see picture 4.1). Various versions of the MRC report have subsequently been published with the aim of improving the methods of muscle examination. The 1976 revision, “MRC Memorandum No. 45 (superseding War Memorandum No. 7): Aids to the examination of the peripheral nervous system”, HMSO, London (Committee medical 1976), includes the recognition that
“Grades 4-, 4 and 4+ may be used to indicate movement against slight, moderate and strong resistance respectively” (Committee medical 1976)
It is this scale that has held a pre-eminent position in clinical muscle force assessment and in research outcomes for the past three generations. The author has conducted a recent review [Chapter 5.5.1] of leading clinicians (n=18) across the world on their preference for recording muscle force, and 100% used the MRC system. The MRC system has been central to international medical education because it is easy to understand and (until the introduction of the arbitrarily assessed gradations of slight, moderate and strong resistance within grade 4) was highly reproducible and valid. The most recent (2010) edition of ‘Aids to the Investigation of Peripheral Nerve Injuries, Medical Research Council: Nerve Injuries Research Committee’ starts with a historical review and appreciation of its application over the years (Compston 2010).
In 1983 Kendall and McCreary (F. P. Kendall & MacCreary 1983) revisited the work of Kendall and Kendall in the 1940s within the 0-to-5 scale. They equated 4/5 strength with a level of force called ‘good’ and suggested (without clear justification) that this should be considered as representing 80% of full strength. None of these estimations of force seemed to relate to the descriptions given in the MRC system.
The empirical feeling of what ‘good’ and ‘fair’ outcomes are sat poorly with where these lie within the agreed classification boundaries of the MRC system. What was needed was a reconciliation between what researchers and clinicians thought was good and fair and what the defined boundaries (flawed as they were) actually demonstrated. This clarification came from MacAvoy in a robust cadaveric biomechanical analysis (MacAvoy & Green 2007), which showed that 4% of a muscle’s possible range of force is required for function against gravity. From this work, MRC 4 can be equated to a statement that the muscle produces “at least 4% of full force”. Further, if MRC 5 is taken as 95-100%, then 91% of the power range is contained within MRC grade 4 (MacAvoy & Green 2007). The following graphic represents this distribution.
Graphic representation of distribution of percentage muscle force required to attain differing levels of MRC assessed Force. Based on MacAvoy & Green 2007.
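The arithmetic behind this distribution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4% and 95% thresholds are those quoted above from MacAvoy & Green (2007), whereas lumping grades 0 to 3 into a single band, and the names `BANDS`, `band_widths` and `band_for`, are simplifications introduced here and are not part of the cited work.

```python
# Assumed band edges, as a percentage of full muscle force.
# Only the 4% and 95% thresholds come from the text (MacAvoy & Green 2007);
# grouping grades 0 to 3 into one band is a simplification for illustration.
BANDS = [
    ("MRC 0-3", 0, 4),
    ("MRC 4", 4, 95),
    ("MRC 5", 95, 100),
]


def band_widths(bands):
    """Return {label: width of the band in percentage points of full force}."""
    return {label: upper - lower for label, lower, upper in bands}


def band_for(percent_of_full_force, bands=BANDS):
    """Return the label of the band containing a given percentage of full force."""
    if not 0 <= percent_of_full_force <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    for label, lower, upper in bands:
        if lower <= percent_of_full_force < upper:
            return label
    return bands[-1][0]  # exactly 100% belongs to the top band


widths = band_widths(BANDS)
for label, width in widths.items():
    print(f"{label}: {width} percentage points of the force range")
```

Under these assumptions a muscle producing 10% of full force and one producing 90% would both be recorded as grade 4, which is the insensitivity the text describes.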
Although the MRC scale is a cardinal feature of daily neurological practice, it had been recognised long before MacAvoy’s study that it was not an ideal tool, perhaps because the inequity of its categories (with grades 1, 2 and 3 being too narrow, and 4 being too broad) was appreciated. This led to many attempts to modify the scale (Brandsma et al., 1995; Dyck et al., 2005; Cuthbert and Goodheart, 2007; MacAvoy and Green, 2007; Merlini, 2010), many adding the subdivisions 4-, 4 and 4+, or starting to quantify within grade 4 the ability to lift certain given weights. These attempts, however well intended, still do not provide the ideal continuous scale for the assessment of maximal muscle force.
All of the scoring systems described (whether normal, good or fair; 5, 4 or 3; or including + or – grades) are subjective descriptions of strength, not objective measures. They are ordinal numbers: only the order of the numbers is meaningful, whereas the distance between two numbers or grades does not lend itself to practical interpretation and cannot be the basis for meaningful arithmetic operations (even though many publications quote non-integer outcomes).
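A small sketch may make the point about ordinal data concrete. The grades below are hypothetical values invented purely for illustration (they are not data from any study), and `summarise` is a helper name introduced here. The two groups share the same arithmetic mean, a non-integer "grade" that corresponds to no clinical description, yet their medians and grade counts differ.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical MRC grades for two small groups; not data from any study.
group_a = [3, 3, 3, 3, 5]
group_b = [1, 4, 4, 4, 4]


def summarise(grades):
    """Summarise ordinal grades by their median and a table of counts per grade."""
    return {
        "median": median(grades),
        "counts": dict(sorted(Counter(grades).items())),
    }


# The mean treats the step from grade 1 to 2 as equal to the step from 4 to 5,
# which an ordinal scale does not justify.
print("mean A:", mean(group_a), "mean B:", mean(group_b))
print("summary A:", summarise(group_a))
print("summary B:", summarise(group_b))
```

Order-based summaries such as the median and the grade counts respect the ordinal nature of the scale; the mean does not.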
Having assessed the validity of MRC grading of manual muscle testing (MMT), it is important to know whether the method of assessment is reproducible. The inter-tester and intra-tester reliability of MMT graded with the MRC system has been shown to be acceptable (Hislop & Montgomery 2007). A further problem with this system is the inherent inter-subject variability in muscle strength; this weakness makes it useful primarily for tracking intra-subject changes in strength rather than for inter-subject comparisons (James 2007). In addition, as James has pointed out, “this system tempts the examiner to consider a muscle with a certain grade of strength as having the same degree of recovery as another muscle with the same grade, when in fact the amount of recovery necessary to enable the deltoid to be graded 3 may be considerably different than the amount of recovery necessary to enable a wrist extensor to be graded 3” (James 2007; MacAvoy & Green 2007).
It is therefore necessary, if clinicians in nerve surgery are to pursue improvements in outcome for patients, to have a scale more responsive to differences within the MRC grade 4 range. This should be a continuous numerical scale on which a force can be recorded as any value between 0 and full power, with an infinite range of possible values between these two limits. This will then allow statistical comparison of differing populations to assess whether a specific intervention has been beneficial.
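As an illustration of the kind of comparison a continuous scale permits, the sketch below computes Welch's t statistic for two independent groups using only the Python standard library. The choice of Welch's test is an assumption made here, not a method prescribed by the text; the force values are hypothetical, and `welch_t`, `control` and `treated` are illustrative names.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Each sample needs at least two values and the samples must not both be constant.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_b) - mean(sample_a)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical peak forces in newtons; invented for illustration only.
control = [10.0, 12.0, 14.0]
treated = [20.0, 22.0, 24.0]

t_stat, dof = welch_t(control, treated)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

The statistic would then be referred to a t distribution to obtain a p-value. No equivalent calculation is defensible on ordinal MRC grades, because it relies on differences between values being meaningful.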
The advent of mechanical testing brought the ability to measure muscle force accurately, first by mechanical and more recently by electronic means. The isokinetic dynamometer is a lab-based device which offers very high reliability and validity for a variety of biomechanical assessments. Isokinetic dynamometers, such as the Cybex (USA) (Rowell 1988), the Biodex (USA) (Valovich-McLeod et al. 2004), or the model D60107MK1 (Penny and Giles transducers, Christchurch, Hampshire) (Quick et al. 2016) [Chapter six], can measure a number of properties such as dynamic peak torque, peak torque angle, angle-specific torque, power, and energy used. Their use is, however, not practical in the standard clinic environment and thus their utility is limited. The hand-held dynamometer was therefore developed, together with techniques to maximise its reproducibility and reliability. The assessment of force with a hand-held dynamometer has historically been shown to be valid in both adults and children (Bohannon 1995; Bohannon 1997).
Standard hand-held dynamometers (HHD) can be used for the assessment of maximum volitional contraction (MVC); they are practical and inexpensive.
“We conclude that a hand-held dynamometer and a fixed dynamometer yield comparable results in patients with neuromuscular disease, provided that testing is limited to muscle groups producing relatively low forces” (Brinkmann 1994)
Reproducibility studies have shown a high intra-class correlation coefficient (0.91–0.97) and a low standard error of measurement (SEM) (3%) in all muscle regions tested (Colombo et al. 2000), with an intra-class correlation coefficient of 0.96 for elbow flexion.
Kilmer et al. (1997) report a very similar reliability finding, stating that hand-held dynamometry ‘appears to be a reliable method to measure maximal isometric strength in persons with neurogenic weakness, and may be useful to quickly and objectively evaluate strength in the clinical setting’.
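The relationship between the two reproducibility figures quoted above can be sketched with the commonly used formula SEM = SD × √(1 − ICC). Whether Colombo et al. (2000) derived their SEM in exactly this way is not stated here, so this is an assumption. The ICC of 0.96 is the elbow flexion value quoted above; the standard deviation and mean are hypothetical, and the function names are illustrative.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (ICC)."""
    if not 0 <= icc <= 1:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * sqrt(1 - icc)


def sem_percent(sem, mean_value):
    """Express the SEM as a percentage of the group mean."""
    return 100 * sem / mean_value


# ICC taken from the text (elbow flexion); SD and mean are hypothetical.
icc_elbow_flexion = 0.96
hypothetical_sd_newtons = 20.0
hypothetical_mean_newtons = 150.0

sem = standard_error_of_measurement(hypothetical_sd_newtons, icc_elbow_flexion)
print(f"SEM = {sem:.2f} N ({sem_percent(sem, hypothetical_mean_newtons):.1f}% of mean)")
```

The higher the ICC, the smaller the share of the observed spread that is attributable to measurement error, which is why a high ICC and a low SEM tend to be reported together.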
Wiles and Karni (1983), reporting the Queen Square experience, state that
“For most muscle and some peripheral nerve disorders it is change in strength which is the ultimate manifestation of improvement or deterioration in the underlying disease.” (Wiles & Karni 1983)
This paper finds
“In conclusion we find that several muscle groups in patients with peripheral neuromuscular disorders can be satisfactorily and reproducibly measured using the hand held myometer …and suggest that the technique is highly appropriate for routine clinical application.”
Stark et al. (2011), after undertaking a large systematic review comparing HHD with isokinetic dynamometry, conclude:
“Compared with isokinetic devices this instrument [HHD] can be regarded as a reliable and valid instrument for muscle strength assessment in a clinical setting.” (Stark et al. 2011)
A frequently identified criticism of HHD is that, when measuring subnormal strength in strong muscle groups, the force produced may exceed the strength of the tester and thus be underestimated (Visser et al. 2003). Whilst this may introduce an error in theory, the muscles and the population under study will uncommonly overcome the examiner.
“Perhaps World War II surgeons using early techniques of nerve repair were gratified to achieve grade 3 strength in a previously paralyzed muscle, and the differences between grades 3, 4, and 5 did not concern them, because this level of recovery was usually not attained”
“Modern techniques may achieve better results and engender higher expectations of a measurement system.”
“Unless HHD is widely adopted or until a better grading system is developed and well validated, the MRC will continue to be used.” (James 2007)
The argument for the need, validity and reproducibility of HHD for measuring muscle force is clear. The aim now must necessarily progress to consider aspects of recovery after muscle re-innervation other than peak volitional force. The patient’s experience is central to this exploration. The assessment of outcomes has evolved from physician-assessed to patient-assessed outcomes. This has been driven by a so-called revolution in health care (Relman 1991), which works towards outcome measures that validly capture improvement across a wide spectrum of influence (Swiontkowski et al. 1999): quality of life (de Putter et al. 2014), satisfaction (Hamilton et al. 2013), function (Hudak et al. 1996), right down to specific object-orientated outcomes (Waljee et al. 2014).
The history of assessment of muscle function has been, almost without other focus, centred around the assessment of maximal volitional force. This focus has grown and developed over time until (in recognition of the flaws of a discrete system) the technologic advances have made continuous peak force assessment possible.
Recognising other features of motor recovery function will provide more detail in assessing outcomes and these methods will undoubtedly now become more and more the focus of assessing the outcomes of re-innervated muscle function.
Bastron, J.A., 1956. Clinical examinations in neurology,
Bohannon, R.W., 1995. Internal consistency of dynamometer measurements in healthy subjects and stroke patients. Perceptual and motor skills, 81(3f), pp.1113–1114.
Bohannon, R.W., 1997. Internal consistency of manual muscle testing scores. Perceptual and motor skills, 85(6), p.736.
Brinkmann, J.R., 1994. Comparison of a hand-held and fixed dynamometer in measuring strength of patients with neuromuscular disease. Journal of Orthopaedic & Sports Physical Therapy, 19(2), pp.100–104.
Colombo, R. et al., 2000. Measurement of isometric muscle strength: a reproducibility study of maximal voluntary contraction in normal subjects and amyotrophic lateral sclerosis patients. Medical engineering & …, 22(3), pp.167–174.
Committee, M.R.C.G.B.N.I.medical, 1976. Memorandum No. 45 – Aids to the Examination of the Peripheral Nervous System, H.M. Stationery Office.
Compston, A., 2010. Aids to the Investigation of Peripheral Nerve Injuries. Medical Research Council: Nerve Injuries Research Committee. His Majesty’s Stationery Office: 1942; pp. 48 (iii) and 74 figures and 7 diagrams; with Aids to the Examination of the Peripheral Nervous System. By Michael O’Brien for the Guarantors of Brain. Saunders Elsevier: 2010; pp.  64 and 94 Figures. Brain, 133(10), pp.2838–2844.
Cowen, R. et al., 2015. Assessing pain objectively: the use of physiological markers. Anaesthesia, 70(7), pp.828–847.
de Putter, C.E. et al., 2014. Health-related quality of life after upper extremity injuries and predictors for suboptimal outcome. Injury, 45(11), pp.1752–1758.
Dyck, P.J. et al., 2005. History of standard scoring, notation, and summation of neuromuscular signs. A current survey and recommendation. Journal of the Peripheral Nervous System, 10(2), pp.158–173.
Hamilton, D.F. et al., 2013. What determines patient satisfaction with surgery? A prospective cohort study of 4709 patients following total joint replacement. BMJ open, 3(4), pp.e002525–8.
Herr, K. et al., 2006. Pain Assessment in the Nonverbal Patient: Position Statement with Clinical Practice Recommendations. Pain Management Nursing, 7(2), pp.44–52.
Hislop, H.J. & Montgomery, J., 2007. Daniels and Worthingham’s Muscle Testing, Saunders.
Hoenig, L.J., 1997. Jacob’s limp. Seminars in Arthritis and Rheumatism, 26(4), pp.684–688.
Hudak, P.L., Amadio, P.C. & Bombardier, C., 1996. Development of an upper extremity outcome measure: the DASH (disabilities of the arm, shoulder, and hand). American Journal of ….
James, M.A., 2007. Use of the Medical Research Council Muscle Strength Grading System in the Upper Extremity. The Journal of Hand Surgery, 32(2), pp.154–156.
Kendall, F.P. & MacCreary, E.K., 1983. Muscles, Williams & Wilkins.
Kendall, H.O., Kendall, F.P. & Wadsworth, G.E., 1971. Muscles, testing and function, Williams & Wilkins.
Kilmer, D.D. et al., 1997. Hand-held dynamometry reliability in persons with neuropathic weakness. Archives of Physical Medicine and Rehabilitation, 78(12), pp.1364–1368.
Lovett, R.W., 1916. Certain aspects of infantile paralysis. Journal of the American Medical Association, LXVI(10), p.729.
MacAvoy, M.C. & Green, D.P., 2007. Critical Reappraisal of Medical Research Council Muscle Testing for Elbow Flexion. The Journal of Hand Surgery, 32(2), pp.149–153.
Mitchell, S.W. & Lewis, M.J., 1886. The tendon-jerk and muscle-jerk in disease, and especially in posterior sclerosis. The American Journal of the Medical Sciences, 184, pp.363–372.
Murakami, H., 2011. The Wind-Up Bird Chronicle, Random House.
Quick, T.J. et al., 2016. A quantitative assessment of the functional recovery of flexion of the elbow after nerve transfer in patients with a brachial plexus injury. Bone Joint ….
Relman, A., 1991. Assessment and Accountability: The Third Revolution in Medical Care. Journal of Diagnostic Medical Sonography, 7(2), pp.107–107.
Rowell, M.A., 1988. Isokinetic strength testing of the elbow joint using the Cybex II dynamometer.
Slater, R. et al., 2008. How Well Do Clinical Pain Assessment Tools Reflect Pain in Infants? A. D. Edwards, ed. PLoS Medicine, 5(6), p.e129.
Stark, T. et al., 2011. Hand-held Dynamometry Correlation With the Gold Standard Isokinetic Dynamometry: A Systematic Review. PM&R, 3(5), pp.472–479.
Swiontkowski, M.F., Buckwalter, J.A. & Keller, R.B., 1999. Symposium-The Outcomes Movement in Orthopaedic Surgery: Where We Are and Where We Should Go. J Bone Joint Surg ….
Valovich-McLeod, T.C. et al., 2004. Reliability and validity of the Biodex system 3 pro isokinetic dynamometer velocity, torque and position measurements. European journal of …, 91(1), pp.22–29.
Visser, J., Mans, E. & de Visser, M., 2003. Comparison of maximal voluntary isometric contraction and hand-held dynamometry in measuring muscle strength of patients with progressive lower motor neuron …. Neuromuscular ….
Waljee, J. et al., 2014. Patient expectations and patient-reported outcomes in surgery: A systematic review. Surgery, 155(5), pp.799–808.
Wiles, C.M. & Karni, Y., 1983. The measurement of muscle strength in patients with peripheral neuromuscular disorders. Journal of Neurology, Neurosurgery & Psychiatry, 46(11), pp.1006–1013.
Wright, W.G., 1912. Muscle training in the treatment of infantile paralysis. The Boston Medical and Surgical Journal, 167(17), pp.567–574.