Residue numbering



While figuring out the (easy) numbering scheme, having the GpHR Alignment in another window will be very helpful!

The transmembrane region

Let's start with the part of GPCR numbering that has already been described.
While we would love to have a single numbering scheme for the easy identification of conserved residues, several numbering systems for GPCRs have been suggested and used.
The GPCRDB numbering (or Oliveira numbering) and the Ballesteros numbering are the most useful, but each still has its anomalies. The GPCRDB scheme does not give the same decimal number to the most conserved residue in each TM: the leucine in TM2 has GPCRDB number 220, whereas the arginine in TM3 has 340. Ideally these would have been 250 and 350.
The Ballesteros numbering is more straightforward and does use the "50" decimal to pinpoint the most conserved residue. Unfortunately, the conservation of residues was miscalculated. According to that scheme, D2.50 and P5.50 are the most conserved residues in TMs 2 and 5, but when all GPCR sequences are taken into consideration (as calculated with the Alignment Explorer), the most conserved residue in TM2 is the leucine (2.46), as in the GPCRDB scheme, and the one in TM5 is the tyrosine (5.58).

Both numbering schemes have been implemented in the GlycoProtein Hormone Receptor Mutation Database (GPMD), so you can use the scheme you feel most comfortable with.

Below is a table for the interconversion between the Ballesteros and the GPCRDB numbering, using the sequence of the human TSHR as an example. Because Helix 8 also appears to be conserved in the Class A GPCRs, we have introduced a numbering for the loop between helix 7 and helix 8 (which has the same length in all Class A GPCRs) and for Helix 8 itself. The numbering is an extension of the Ballesteros numbering (we call it Ballesteros-Extended); it starts right after 7.54 with 8.45, and we annotated the conserved phenylalanine in Helix 8 as 8.50.

TM 1
Residue  Ballesteros  GPCRDB
M        1.30         110
G        1.31         111
Y        1.32         112
K        1.33         113
F        1.34         114
L        1.35         115
R        1.36         116
I        1.37         117
V        1.38         118
V        1.39         119
W        1.40         120
F        1.41         121
V        1.42         122
S        1.43         123
L        1.44         124
L        1.45         125
A        1.46         126
L        1.47         127
L        1.48         128
G        1.49         129
N        1.50         130
V        1.51         131
F        1.52         132
V        1.53         133
L        1.54         134
L        1.55         135
I        1.56         136
L        1.57         137
L        1.58         138
T        1.59         139

TM 2
Residue  Ballesteros  GPCRDB
V        2.38         212
P        2.39         213
R        2.40         214
F        2.41         215
L        2.42         216
M        2.43         217
C        2.44         218
N        2.45         219
L        2.46         220
A        2.47         221
F        2.48         222
A        2.49         223
D        2.50         224
F        2.51         225
C        2.52         226
M        2.53         227
G        2.54         228
M        2.55         229
Y        2.56         230
L        2.57         231
L        2.58         232
L        2.59         233
I        2.60         234
A        2.61         235
S        2.62         236
V        2.63         237
D        2.64         238
L        2.65         239
Y        2.66         240
T        2.67         241

TM 3
Residue  Ballesteros  GPCRDB
G        3.22         312
P        3.23         313
G        3.24         314
C        3.25         315
N        3.26         316
T        3.27         317
A        3.28         318
G        3.29         319
F        3.30         320
F        3.31         321
T        3.32         322
V        3.33         323
F        3.34         324
A        3.35         325
S        3.36         326
E        3.37         327
L        3.38         328
S        3.39         329
V        3.40         330
Y        3.41         331
T        3.42         332
L        3.43         333
T        3.44         334
V        3.45         335
I        3.46         336
T        3.47         337
L        3.48         338
E        3.49         339
R        3.50         340
W        3.51         341
Y        3.52         342
A        3.53         343
I        3.54         344

TM 4
Residue  Ballesteros  GPCRDB
R        4.40         410
H        4.41         411
A        4.42         412
C        4.43         413
A        4.44         414
I        4.45         415
M        4.46         416
V        4.47         417
G        4.48         418
G        4.49         419
W        4.50         420
V        4.51         421
C        4.52         422
C        4.53         423
F        4.54         424
L        4.55         425
L        4.56         426
A        4.57         427
L        4.58         428
L        4.59         429
P        4.60         430
L        4.61         431
V        4.62         432

TM 5
Residue  Ballesteros  GPCRDB
L        5.35         505
A        5.36         506
L        5.37         507
A        5.38         508
Y        5.39         509
I        5.40         510
V        5.41         511
F        5.42         512
V        5.43         513
L        5.44         514
T        5.45         515
L        5.46         516
N        5.47         517
I        5.48         518
V        5.49         519
A        5.50         520
F        5.51         521
V        5.52         522
I        5.53         523
V        5.54         524
C        5.55         525
C        5.56         526
C        5.57         527
Y        5.58         528
V        5.59         529
K        5.60         530

TM 6
Residue  Ballesteros  GPCRDB
D        6.30         600
T        6.31         601
K        6.32         602
I        6.33         603
A        6.34         604
K        6.35         605
R        6.36         606
M        6.37         607
A        6.38         608
V        6.39         609
L        6.40         610
I        6.41         611
F        6.42         612
T        6.43         613
D        6.44         614
F        6.45         615
I        6.46         616
C        6.47         617
M        6.48         618
A        6.49         619
P        6.50         620
I        6.51         621
S        6.52         622
F        6.53         623
Y        6.54         624
A        6.55         625
L        6.56         626
S        6.57         627
A        6.58         628
I        6.59         629
L        6.60         630

TM 7 - Helix 8
Residue  Ballesteros+Helix8  GPCRDB
N        7.33                713
S        7.34                714
K        7.35                715
I        7.36                716
L        7.37                717
L        7.38                718
V        7.39                719
L        7.40                720
F        7.41                721
Y        7.42                722
P        7.43                723
L        7.44                724
N        7.45                725
S        7.46                726
C        7.47                727
A        7.48                728
N        7.49                729
P        7.50                730
F        7.51                731
L        7.52                732
Y        7.53                733
A        7.54                734
I        7.55                735
F        7.56                736
T        8.47                807
K        8.48                808
A        8.49                809
F        8.50                810
Q        8.51                811
R        8.52                812
D        8.53                813
V        8.54                814
F        8.55                815
I        8.56                816
L        8.57                817
L        8.58                818
S        8.59                819
K        8.60                820
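
For readers who want to convert between the two schemes programmatically, the tables above can be condensed into one offset per helix. The sketch below is an illustration only: the names OFFSETS, ballesteros_to_gpcrdb and gpcrdb_to_ballesteros are hypothetical, and the offsets were read off the hTSHR tables on this page, so they should only be trusted for positions those tables cover.

```python
# Per-helix offsets read off the hTSHR tables above (an assumption of this
# sketch, not an official definition): GPCRDB = 100 * helix + position - offset.
# Example: 1.30 <-> 110 gives offset 20 for TM1; 2.38 <-> 212 gives 26 for TM2.
OFFSETS = {1: 20, 2: 26, 3: 10, 4: 30, 5: 30, 6: 30, 7: 20, 8: 40}


def ballesteros_to_gpcrdb(label: str) -> int:
    """Convert a label such as '3.50' to its GPCRDB number (340)."""
    helix_text, position_text = label.split(".")
    helix, position = int(helix_text), int(position_text)
    return 100 * helix + position - OFFSETS[helix]


def gpcrdb_to_ballesteros(number: int) -> str:
    """Convert a GPCRDB number such as 340 back to '3.50'."""
    helix, rest = divmod(number, 100)
    return f"{helix}.{rest + OFFSETS[helix]:02d}"


if __name__ == "__main__":
    for label in ("2.46", "3.50", "5.58", "8.50"):
        print(label, "->", ballesteros_to_gpcrdb(label))
```

The two anomalies discussed earlier fall out directly: 2.46 maps to 220 and 3.50 maps to 340, because TM2 and TM3 use different offsets.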


The intra- and extracellular loops

All GpHRs have an equal number of residues in the loop regions between the transmembrane helices. We have therefore decided to number the loop residues as well. Each loop number begins with an 'L' (indicating Loop), followed by the number of the TM before the loop, followed by a two-digit position that starts at 10 for the first loop residue. From there on we number downstream, so the first residue of the loop after TM1 is L110, the next is L111, and so on. Applying this numbering scheme (using hTSHR as an example) makes it possible to query the database for any loop residue among the three receptor classes:

Loop1-2
Residue  Number
S        L110
H        L111
Y        L112
K        L113
L        L114
N        L115

Loop2-3
Residue  Number
H        L210
S        L211
E        L212
Y        L213
Y        L214
N        L215
H        L216
A        L217
I        L218
D        L219
W        L220
Q        L221
T        L222

Loop3-4
Residue  Number
T        L310
F        L311
A        L312
M        L313
R        L314
L        L315
D        L316
R        L317
K        L318
I        L319
R        L320
L        L321

Loop4-5
Residue  Number
G        L410
I        L411
S        L412
S        L413
Y        L414
A        L415
K        L416
V        L417
S        L418
I        L419
C        L420
L        L421
P        L422
M        L423
D        L424
T        L425
E        L426
T        L427
P        L428

Loop5-6
Residue  Number
I        L510
Y        L511
I        L512
T        L513
V        L514
R        L515
N        L516
P        L517
Q        L518
Y        L519
N        L520
P        L521
G        L522
D        L523
K        L524

Loop6-7
Residue  Number
N        L610
K        L611
P        L612
L        L613
I        L614
T        L615
V        L616
S        L617
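
The loop rule is simple enough to write down as a few lines of code. This is a minimal sketch under the assumption that a loop never has more than 90 residues (otherwise the two-digit position would overflow); the function name loop_labels is hypothetical and not part of the GPMD.

```python
def loop_labels(preceding_tm: int, residues: str) -> list[tuple[str, str]]:
    """Label loop residues as 'L' + TM number before the loop + position.

    Positions start at 10 for the first loop residue and increase downstream.
    """
    return [
        (f"L{preceding_tm}{10 + index}", residue)
        for index, residue in enumerate(residues)
    ]


if __name__ == "__main__":
    # Loop between TM1 and TM2 of hTSHR, taken from the table above.
    for label, residue in loop_labels(1, "SHYKLN"):
        print(label, residue)
```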


The extracellular domain (ECD)

To allow easy and consistent searching throughout the GPMD, and to facilitate communication between researchers working on different GpHRs, we have introduced a general numbering scheme for the ECD of the three major classes of GpHRs (TSHR, FSHR, LHR). Such a numbering scheme can only be introduced when the ECDs of all these receptors are perfectly alignable, and as we know, they all are.

Smits et al. have described in their EMBO paper a canonical LRR motif: X1 X2 L X3 L X4 X5
X3 is a hydrophilic residue that sticks out into the center of the horseshoe-shaped ECD and lies between the two hydrophobic residues (L, I, V) that confine the leucine-rich repeat structure. The picture below was taken from the paper and annotates these residues in the hTSHR, hLH/CGr and hFSHR.



Of course, we cannot use this limited numbering scheme with X's and L's, so we have decided to assign the numbers 1050, 2050, 3050, 4050, 5050, 6050, 7050, 8050 and 9050 to the X3 residue in LRR1, LRR2, LRR3, LRR4, LRR5, LRR6, LRR7, LRR8 and LRR9, respectively.
These residues in the hTSHR are indicated in blue in the picture below (don't mind 4050, it's a little glycine):



Picture made with YASARA (http://www.yasara.org)

We consider a leucine-rich repeat to be a beta strand followed by a coil, a helical region and again a small coil: Strand-Coil-Helix-Coil
We begin the numbering of each LRR just before the strand starts. The first residue of each LRR is indicated in purple in the picture below.
First we assign 1050, 2050, etc. to the X3 residue, and from there we number upstream and downstream until the previous or next LRR.



Picture made with YASARA (http://www.yasara.org)

And here at last, the complete numbering of the ECD, using hTSHR as an example:

LRR 1
Residue  Number
T        1046
Q        1047
T        1048
L        1049
K        1050
L        1051
I        1052
E        1053
T        1054
H        1055
L        1056
R        1057
T        1058
I        1059
P        1060
S        1061
H        1062
A        1063
F        1064
S        1065
N        1066
L        1067
P        1068
N        1069

LRR 2
Residue  Number
I        2046
S        2047
R        2048
I        2049
Y        2050
V        2051
S        2052
I        2053
D        2054
V        2055
T        2056
L        2057
Q        2058
Q        2059
L        2060
E        2061
S        2062
H        2063
S        2064
F        2065
Y        2066
N        2067
L        2068
S        2069
K        2070

LRR 3
Residue  Number
V        3046
T        3047
H        3048
I        3049
E        3050
I        3051
R        3052
N        3053
T        3054
R        3055
N        3056
L        3057
T        3058
Y        3059
I        3060
D        3061
P        3062
D        3063
A        3064
L        3065
K        3066
E        3067
L        3068
P        3069

LRR 4
Residue  Number
L        4045
L        4046
K        4047
F        4048
L        4049
G        4050
I        4051
F        4052
N        4053
T        4054
G        4055
L        4056
K        4057
M        4058
F        4059
P        4060
D        4061
L        4062
T        4063
K        4064
V        4065
Y        4066
S        4067
T        4068
D        4069
I        4070

LRR 5
Residue  Number
F        5046
F        5047
I        5048
L        5049
E        5050
I        5051
T        5052
D        5053
N        5054
P        5055
Y        5056
M        5057
T        5058
S        5059
I        5060
P        5061
V        5062
N        5063
A        5064
F        5065
Q        5066
G        5067
L        5068
C        5069
N        5070

LRR 6
Residue  Number
E        6045
T        6046
L        6047
T        6048
L        6049
K        6050
L        6051
Y        6052
N        6053
N        6054
G        6055
F        6056
T        6057
S        6058
V        6059
Q        6060
G        6061
Y        6062
A        6063
F        6064
N        6065
G        6066
T        6067

LRR 7
Residue  Number
K        7045
L        7046
D        7047
A        7048
V        7049
Y        7050
L        7051
N        7052
K        7053
N        7054
K        7055
Y        7056
L        7057
T        7058
V        7059
I        7060
D        7061
K        7062
D        7063
A        7064
F        7065
G        7066
G        7067
V        7068
Y        7069
S        7070

LRR 8
Residue  Number
G        8045
P        8046
S        8047
L        8048
L        8049
D        8050
V        8051
S        8052
Q        8053
T        8054
S        8055
V        8056
T        8057
A        8058
L        8059
P        8060
S        8061
K        8062
G        8063
L        8064
E        8065

LRR 9
Residue  Number
H        9045
L        9046
K        9047
E        9048
L        9049
I        9050
A        9051
R        9052
N        9053
T        9054
W        9055
T        9056
L        9057
K        9058
K        9059
L        9060
P        9061
L        9062
S        9063
L        9064
S        9065
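
The ECD rule (give the X3 residue of repeat n the number n050, then count upstream and downstream) can be sketched as follows. The function name number_lrr and the x3_index argument are hypothetical; the position of X3 within each segment is assumed to be known from the alignment and is not detected by the code.

```python
def number_lrr(repeat: int, residues: str, x3_index: int) -> list[tuple[int, str]]:
    """Number one leucine-rich repeat around its X3 residue.

    repeat   -- 1 for LRR1, 2 for LRR2, and so on up to 9
    residues -- the residues of this repeat, N- to C-terminal
    x3_index -- 0-based index of the X3 residue inside `residues`
    """
    anchor = repeat * 1000 + 50  # 1050, 2050, ... for the X3 residue
    return [
        (anchor + index - x3_index, residue)
        for index, residue in enumerate(residues)
    ]


if __name__ == "__main__":
    # LRR1 of hTSHR from the table above; X3 is the lysine at index 4.
    for number, residue in number_lrr(1, "TQTLKLIETHLRTIPSHAFSNLPN", 4):
        print(number, residue)
```

Because the strand does not start at the same distance from X3 in every repeat, the first number differs between repeats (1046 for LRR1, 4045 for LRR4), while X3 always lands on n050.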


The hinge region

The hinge part right after the last LRR and the part just before TM1 share a good amount of conservation throughout the entire GpHR family. Since a few very conserved and well-studied residues and motifs can be found here, we have numbered these alignable parts of the hinge region. We have divided it into two parts: a stretch after LRR9 and a stretch before TM1.

The key residue of the first part is labeled H50 and is the well-conserved serine of the SHCC motif. From there we number upstream and downstream, from H38 to H61. The key residue of the second part is labeled H150 and was chosen to be the aspartate of the well-conserved F/YDY motif. From there we number upstream and downstream, from H143 to H175.

Hinge part 1
Residue  Number
F        H38
L        H39
H        H40
L        H41
T        H42
R        H43
A        H44
D        H45
L        H46
S        H47
Y        H48
P        H49
S        H50
H        H51
C        H52
C        H53
A        H54
F        H55
K        H56
N        H57
Q        H58
K        H59
K        H60
I        H61

Hinge part 2
Residue  Number
Q        H143
A        H144
F        H145
D        H146
S        H147
H        H148
Y        H149
D        H150
Y        H151
T        H152
I        H153
C        H154
G        H155
D        H156
S        H157
E        H158
D        H159
M        H160
V        H161
C        H162
T        H163
P        H164
K        H165
S        H166
D        H167
E        H168
F        H169
N        H170
P        H171
C        H172
E        H173
D        H174
I        H175
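
As an illustration of how the hinge labels follow from the two motifs, the sketch below locates the motif in a hinge segment with a regular expression and numbers the segment around the key residue. The function number_hinge and its parameters are hypothetical, and the code assumes the first motif match in the segment is the intended one, which holds for the hTSHR segments shown above but is not checked for other receptors.

```python
import re


def number_hinge(segment: str, motif: str, key_offset: int,
                 key_number: int) -> list[tuple[str, str]]:
    """Number a hinge segment around the key residue of a motif.

    motif      -- regular expression for the motif, e.g. 'SHCC' or '[FY]DY'
    key_offset -- position of the key residue inside the motif (0-based)
    key_number -- number given to the key residue (50 or 150)
    """
    match = re.search(motif, segment)
    if match is None:
        raise ValueError(f"motif {motif!r} not found in segment")
    key_index = match.start() + key_offset
    return [
        (f"H{key_number + index - key_index}", residue)
        for index, residue in enumerate(segment)
    ]


if __name__ == "__main__":
    # hTSHR hinge segments copied from the tables above.
    part1 = number_hinge("FLHLTRADLSYPSHCCAFKNQKKI", "SHCC", 0, 50)
    part2 = number_hinge("QAFDSHYDYTICGDSEDMVCTPKSDEFNPCEDI", "[FY]DY", 1, 150)
    print(part1[0], part1[-1])
    print(part2[0], part2[-1])
```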