Hydrogen Bond Dynamics of Histamine Monocation in Aqueous Solution : How Geometric Parameters Influence the Hydrogen Bond Strength †

Chemometric statistical approaches involving multiple linear regression (MLR) and principal component analysis (PCA) were employed on a set of 42 distinct snapshot structures of the physiological histamine monocation in aqueous solution along the Car-Parrinello molecular dynamics trajectory, in order to obtain a better insight into the relationship between the geometry parameters of the system and the resulting νNH stretching frequencies. A simple 2D linear regression of νNH with Namino···Owater distances gave a very poor correlation (R2 = 0.42), but both MLR and PCA with the inclusion of four directly bonded water molecules offered a notably predictive model that is even able to distinguish two classes of structures based on the Cl– counterion position. Taking into account waters from the first, second and third solvation shells, sequentially diminished the overall predictive ability of the model, yet increased the number of useful predictors that, in the largest model with 51 solvent molecules, all correspond to bulk water, implying that both chemometric methods are consistent in suggesting that fundamental histamine N–H stretching vibrations are very complex in nature and strongly coupled to the fluctuating environment.


INTRODUCTION
Histamine is a biogenic amine and is mostly known for its role as an inflammation mediator.Its functions as a neuromodulator and neurotransmitter in the brain are less well established, but its known scope of actions is expanding.Recently it has gained importance as a signaling molecule in the processes of sleep-wake cycle, appetite control, learning, memory and emotion.Moreover, its signaling paths seem to be involved in conditions such as depression, schizophrenia, Alzheimer's disease and epilepsy. 1Consequently, processes involving histamine synthesis, transport, metabolism and binding to macromolecules have a significant physiological relevance.In the brain, histamine is synthesized from histidine by a specific enzyme L-histidine decarboxylase, and is taken up into synaptic vesicles by the vesicular monoamine transporter-2. 2 It is released into synaptic cleft upon depolarisation stimuli and like every other neurotransmitter it has to be removed from synaptic cleft in a relatively short time.The primary histamine metabolising enzyme in the brain seems to be an intracellular histamine-N-methyl-transferase, which produces N-methyl-histamine that is later converted to the corresponding aldehyde by monoamine oxidase (MAO).In other pathways, histamine is degraded by diamine oxidase (DAO) and to some extent directly by MAO.
Concerning the nature of molecular interactions between histamine and its larger biological systems, these are dominated by the histamine hydrogen bonding ability, 3 just as it is the case when it is surrounded with solvent water molecules under physiological conditions.Therefore, the hydration of histamine is of considerable interest, particularly with regard to structures and vibrational spectra.Histamine is composed of an imidazole ring and an aminoethyl side-chain, and according to its pK a values (5.8 for the imidazole ring nitrogen atom and 9.4 for the aliphatic amino group), 4 histamine is predominantly a monocation (96 %) protonated at the free-amino position. 5In cells, histamine recognition induces (at least) partial desolvation, implying that a portion of water molecules from hydration shells must be removed to make a way for the interac-Croat.Chem.Acta 87 (2014) 397.
tion with proteins and enzymes.A very illustrative example is provided by the histamine receptors H1-H4, which are all activated through the hydrogen bonding to histamine, 6 suggesting that it is highly desirable to properly understand its interactions with physiological water and biological macromolecules.The latter is in line with recent molecular dynamics simulations by Keserű and co-workers 7 that emphasized the role of water-histamine interactions at the binding site of the human histamine H4 receptor, being in agreement with several previous reports. 8ecently, we modeled the nature of the hydration of physiological histamine monocation by using Car-Parrinello molecular dynamics simulations (CPMD) 9 and a static cluster model. 10In the former study, we simulated the N-H stretching band envelope, which are degrees of freedom that predominantly influence the strength of the hydrogen bonding as probed by the vibrational spectroscopy.The approach involved the construction of one dimensional proton potentials on snapshot structures extracted from a CPMD trajectory, followed by solving the one-dimensional vibrational Schroedinger equation.By separating the amino N-H vibrations of the protonated ethylamino chain from those within the imidazole ring in the experimental spectra, we clearly demonstrated that the former moiety forms stronger hydrogen bonds with the surrounding water molecules, 9,10 which should be significant for the biological function of histamine.It remains a challenge to identify how the fluctuating intra-and intermolecular geometric parameters influence the N-H stretching frequencies, which is the aim of the present work.The most important factor seems to be the proton donor-proton acceptor distance among histamine amino nitrogens and water oxygen atoms.Careful comparisons between the experimental donor-acceptor distances (taken from the frozen crystal structures) and the corresponding stretching frequencies for a series of related systems are known as Novak 11 and Mikenda 12 relations.Analysis of the CPMD trajectory offers an advantage that it is performed for several configurations of the same system and that all other geometric features could be considered.This approach was previously used in the case of a very strong hydrogen bonding in the crystals of sodium hydrogen bis(sulfate), 13,14 where, remarkably, no appreciable correlation between the instantaneous geometric parameters and the pertinent OH stretching frequency could be found, demonstrating the extreme complexity of the short and strong hydrogen bond.In contrast, aqueous histamine forms relatively long and weak hydrogen bonds with water; 9,10 hence part of the motivation for the present study is a comparison of the geometry-frequency correlation patterns between strong and weak hydrogen bonds.

COMPUTATIONAL DETAILS
As part of our interest in examining the hydrogen bond dynamics of the histamine monocation in aqueous solution, 9,10 the present work analyzes the snapshot structures along the CPMD trajectory with chemometric methods -a combination of the mathematical and statistical approaches designed to provide maximum chemical information of the chemical system by analyzing chemical data.For the CPMD simulation details, extraction of snapshot geometries, and the calculation of instantaneous proton potentials and thus derived anharmonic frequencies, the reader is referred to reference 9. Briefly, we collected 42 distinct geometries during 10 ps simulation, involving the histamine monocation, 93 water molecules and Cl -counterion, and since each snapshot structure consists of four N amino •••O water moieties, we tailed a set of 168 anharmonic frequencies.
In order to study the relationship between geometric parameters and the corresponding ν NH frequencies, we started the analysis by considering all internal coordinates of the studied system (termed predictors throughout the text), based on the fact that the main type of interactions in the simulation box is electrostatic in nature, which is essentially long-range in character, therefore the proton dynamics is controlled by surroundings that extend beyond the nearest atoms.However, this made a visual aspect of such multi-dimensional correlation a very difficult and challenging task.To overcome this problem, we reduced the data dimension by employing the Principal Component Analysis (PCA), being recommended as the most appropriate method for preprocessing multi-dimensional data into two dimensions. 15We were also interested in finding a relationship between a single dependent variable (ν NH ) and several independent variables (geometry parameters), for which we utilized the Multiple Linear Regression (MLR) as a calibration technique. 15,16s first, MLR and PCA analysis have been applied on the matrix of 42 × n dimensions, corresponding to 42 snapshot structures with n internal coordinates in columns (see Table 1).The elements of particular columns are given as a quotient of the structural parameters (bond distances, valence angles, dihedral angles).The total number of variables is obtained in the following way.For the system with N atoms, the number of internal degrees of freedom is 3N-6, yielding 888 unique internal coordinates that describe each snapshot structure with 298 atoms. 9Since one internal coordinate of each H-bonded moiety served as a "probe" coordinate along which the proton was displaced in a stepwise manner when calculating the corresponding proton potentials, we omitted it from our analysis.For this purpose, the periodic boundary conditions and replacement of water molecules were also considered.Our Croat.Chem.Acta 87 (2014) 397.
initial inspection suggested poor correlation between all geometry parameters and ν NH at best, giving evidence that the investigated HB interactions are extremely complex as was the case with the strong hydrogen bond in sodium hydrogen bis(sulfate). 14However, in contrast to the virtually nonexistent correlation reported in the latter study, here we analyzed each of the individual subsets of geometric parameters separately (bond distances, bond angles, dihedral angles), and found that when examining only bond distances there appears to be a certain correlation with ν NH values.Therefore, MLR and PCA have been applied on the reduced matrices of 42 × m dimensions, where m denotes the number of bond distances, yielding matrices M1-M4 with m = 30, 60, 129 and 171, respectively (Table 1).We further refined the process by sequentially increasing the number of considered water molecules, starting with four directly bonded molecules, and proceeding by employ- for the first, second and third solvation shells, corresponding to 15, 37 and 51 waters, respectively (Table 1, Figure 1).In addition, we categorized all datasets by assigning class numbers, firstly to the range of Cl , and secondly, to the N-H stretching frequencies (class attributes NH1: ν NH < 2400 cm -1 ; NH2: 2700 ≤ ν NH > 2400 cm -1 ; NH3: 3000 ≤ ν NH > 2700 cm -1 ; NH4: ν NH > 3000 cm -1 ).All calculations and preparation of plots were done with the Teach/Me software 17 using Teach/Me Data Analysis application.

Review of the Calculated Anharmonic Frequencies and the N-H Stretching Envelope
The calculated fundamental excitation frequencies span a wide range from 3229 cm -1 to 2136 cm -1 , being a consequence of very diverse snapshot geometries used to calculate proton potentials of different shapes. 9This Gaussian-type frequency distribution represents the N-H stretching envelope (Figure 2b) having the maximum absorption between 2820 cm -1 and 2730 cm -1 , and the center of gravity at 2799 cm -1 .According to the correlation scheme of Novak, 18 the latter frequency corresponds to the N amino •••O water distance of about 2.75 Å, which is around 0.1 Å shorter than the calculated CPMD-averaged value of 2.85 Å. 9 However, it should be noted that in contrast to the dynamic aqueous environment of the present system, Novak and co-workers established their scheme on a dataset of solid crystalline structures, hence this offset is likely to be of little significance.Another possible reason for a disagreement is  that the applied DFT method introduced systematic error in the calculation of proton potentials that is reflected in shifted frequencies.Nevertheless, our computed results are in excellent agreement with the experimental spectra in terms of both the shape and position of the N-H stretching band. 9

Multiple Linear Regression (MLR)
In the first stage, ν NH frequencies were visualized in a simple 2D linear regression with only N amino •••O water distances (Figure 2c), which yielded a very poor correlation (R 2 = 0.42), but still considerably better than found for the strong hydrogen bonding in crystalline sodium hydrogen bis(sulfate) (R 2 ≈ 0.2 at best).Since we were particularly interested in whether the hydrogen bond donor-acceptor distance is a better predictor of ν NH than other geometry parameters, at first, we applied the MLR technique to all four N amino •••O water moieties solvated by four directly bonded water molecules (Figure 2a).Since the mentioned fragments are located in different environments, for example relative to the position of the Cl -counterion, the obtained results differed, which led us to apply the MLR analysis to larger matrices of different sizes (see later).One way to express the overall predictive accuracy of multiple regression models is the R 2 value, 19 a measure of how much variation a model explains.With short N amino •••O water distances, the model is quite reasonable yielding R 2 values above 0.75 (Table 2), and for the hydrogen bonds labeled HB1-HB3 (Figure 2a) it provides significant correlation between geometry parameters and ν NH , while for the last one (labeled HB4) we found a weak correlation between data.Interestingly, for each of the four N amino  3), which increased the R 2 values in both cases.From the data we can conclude that the length of the C α -N amino distance is the most influential predictor in both the protonated 3 -NH  and ring-amino group, labeled as d7 and d1 in Figure 3, respectively.
Secondly, we upgraded our model by considering the coordinates of 15 water molecules from the first hydrated shell all with d(N amino •••O water ) < 4.5 Å (Table 2 and Figure 4).It follows that the water molecule labeled with Y is a useful predictor of only one N amino •••O water moiety, while water labeled X is not listed as predictor in studied cases (Figure 4b).For all four H-bonded moieties the common predictors within the histamine monocation are d2 and d7, together with d19 from the bulk.Further improvement was made by including more distant second hydration shell solvent molecules in the analysis (d(N amino •••O water ) < 6.4 Å) yielding a cluster of 37 waters.Again, Cl•••O water distances turned out to be good predictors, together with the majority of other predictors that are all corresponding to bulk water molecules.From the histamine monocation, one should consider d7, listed in HB1, HB2 and HB3, and d1 listed in HB2 and HB3 moieties.It is interesting that no predictors are listed in HB4.Finally, third hydration shell (d(N amino •••O water ) < 7.2 Å) with 51 water molecules was also considered.In this case, Cl•••O water distances are not good predictors, whereas the majority of predictors are geometry parameters of bulk water molecules.It is interesting that, the predictor d8 (N amino •••O water distance) can be found in HB1, HB2 and HB4, and not in HB3 moiety.Moreover, the important predictors from histamine monocation are again d7 in HB1, and d6 and d3 in HB4 group.
Summarizing this section, we can conclude that the linear correlation between geometry parameters and individual frequencies exists.It is also worth stressing out, that by increasing the number of solvent water molecules, the features related to the complexity of the proton motion and its coupling to the environment become more apparent.

Principal Component Analysis (PCA)
PCA was performed in order to get an insight into the overall correlativity of all 168 ν NH values with the  2), which are the most significant predictors for the hydrogen bonds HB1-HB4 (denoted with numbers "1"-"4") in the model with four directly bonded water molecules.Labels "a" and "b" correspond to ClO1 and ClO2 classes of structures, respectively.matching geometry parameters.Similarly to the MLR, PCA was applied for each H-bonded moiety HB1-HB4 on matrices composed of 42 rows and sets of 30, 60, 129, and 171 variables in columns (M1-M4), the latter corresponding to the number of considered internal bond distances in each case.The data was additionally preprocessed by using mean centered data, in other words, by subtracting the overall mean value from individual elements.
PCA analyzes data by yielding eigenvectors (principal compoments, PCs) and the corresponding eigenvalues (variances) by using the covariance matrix of the whole dataset.The majority of the information is usually gathered in the first few PCs; in our case, the smallest matrix M1 already exhibits such behavior as already around 98 % variance is explained in the first two principal components in all four N amino •••O water moieties (Table 3), with the most influencing variables being distances d10, d11, d12, and d13 (Figure 5).We explored a number of pair-wise plots of principal components and compared the corresponding score vs. score displays, and additionally categorized all datasets by assigning class numbers according to a range of both water Cl •••O distances (ClO1 and ClO2) and N-H stretching frequencies (NH1-NH4) (see Supplementary Materials).Interestingly, in M1, apart from the obvious N amino •••O water distances, one of the most influencing parameters is Cl•••O water distance (d12), in agreement with the MLR analysis presented earlier, which is basically responsible that the results clusters in ClO1 (having 25 structures) and ClO2 (having 17 structures) classes, with the prevalence of each determined by d9 and d14 variables, respectively (Table 3).
By increasing the number of water molecules in the model one also increases the number of the most influencing geometry parameters while diminishing the significance of the two classes based on the counterion position.In the case of the largest model M4 (51 water molecules), almost all significant geometry parameters correspond to bulk water molecules.The only exception is provided by d8, which is related to the N amino •••O water moiety.
Further, we were interested in how changing the number of variables affects variances in the first ten PC axes, and found that it exerts pronounced influence (Table 3).By increasing the number of variables, the content of variances decreases, while the percentage of the first three variances in the resulting PCs changes only distinctly.This could be ascribed to the Cl - counterion, which has a more profound effect in the smaller models, and to the complexity of the proton motion and its coupling to its environment.On the other hand, when comparing percentages of the first three variances within similar models (e.g.M1 model with the first shell waters) they are almost identical (Table 3).It follows that with PCA method we obtained similar conclusion as from the MLR studyinstantaneous fluctuating structure has a pronounced effect on the proton motion and its dynamic, which is very complex and differently coupled to the environment.
Along with other established computational methods, the presented approach will contribute towards understanding of the complexity of the hydrogen bonding dynamics in polar environments reflected in the vibrational spectra [20][21][22][23][24] and its relevance for biocatalysis. 25

CONCLUDING REMARKS
The aim of our study was to evaluate the correlation between the frequencies of the N-H stretching band and the corresponding geometry parameters of the snapshot structures of the histamine monocation in aqueous solution.A simple two-dimensional correlation scheme between ν NH and the matching N amino •••O water distances appeared not to have significant predictive ability.On the other hand, more advanced techniques, such as multiple linear regression (MLR) and principal component analysis (PCA), applied on preprocessed data yielded reasonably accurate predictors.For example, around 98 % variance in the data is well explained by the first two principal components in all four N amino •••O water moieties.Moreover, with PCA method applied on the cluster involving only four directly attached water molecules we were able to distinguish two classes of structures based on the counterion position.When the analysis was additionally tuned by assigning class numbers according to the value of the Cl•••O water distances, the percentage of the variance in the first two principal components in all N amino •••O water moieties decreases to around 90 %.By increasing the number of water molecules in the model one also increases the number of the most influencing geometry parameters and diminishes the relevance of the mentioned two structural classes that depend on the position of the Cl -counterion.With the largest model investigated here (51 water molecules), all influential geometry parameters arise from bulk water, and only around 20 % of the variance is explained in the first two principal components of four N amino •••O water groups.This might lead to a conclusion that a correlation between instantaneous fluctuating histamine structure and the N-H stretching frequencies is likely to be poor, and that the dynamic of the proton motion is very complex and strongly coupled to the fluctuating environment.The results obtained from the MLR analysis support this assumption.
Despite the fact that the structure-frequency correlation found for a set of dynamically sampled instantaneous snapshot structures within one system is only moderate, it represents a clear advancement in comparison with the previously investigated system of crystalline sodium hydrogen bis(sulfate), 14 and demonstrates once again that hydrogen bonding is an extremely complex interaction.For the mentioned inorganic material, no appreciable linear or even non-linear correlation could be found, indicating the limitations of the applied one-dimensional quantum treatment.Remarkably, the present system features much weaker and longer hydrogen bonds, which is reflected in the observed correlations between the instantaneous geometric parameters and the corresponding N-H stretching frequencies.Supplementary Materials.-Supporting informations to the paper, containing selected pair-wise plots of principal components PC1 vs. PC2 together with the corresponding score vs. score displays, are enclosed to the electronic version of the article.These data can be found on the website of Croatica Chemica Acta (http://public.carnet.hr/ccacaa).

Figure 2 .
Figure 2. (a) Structure of the investigated histamine monocation with the labeled H-bonded moieties (HB1-HB4); (b) Distribution of the anharmonic NH stretching transitions (red vertical lines) obtained from proton potentials extracted from the snapshot structures along the CPMD trajectory.A continuous representation of this distribution (black curve) is obtained as a superposition of the Gaussian functions (one for each transition) with a half width of 50 cm −1 ; (c) Simple two-dimensional plots used to obtain preliminary qualitative information about the relationship between anharmonic N-H stretching frequencies and the corresponding Namino•••Owater distances.The calculated standard errors of the slope, interception, and regression assume 92.71, 280.1, and 106.2, respectively.

Figure 3 .
Figure 3. First N exploratory variables (numbered according to the predictor names listed in Table2), which are the most significant predictors for the hydrogen bonds HB1-HB4 (denoted with numbers "1"-"4") in the model with four directly bonded water molecules.Labels "a" and "b" correspond to ClO1 and ClO2 classes of structures, respectively.

Figure 4 .
Figure 4. First N predictors of the HB4 moiety in models with different number of solvent water molecules.

Table 1 .
Matrix dimensions used for the chemometric analysis.The internal coordinate space was formulated by considering different number of water molecules (according to the distance between the nitrogen Namino and the oxygen Owater of bulk water) for the each N-H•••O hydrogen-bonded moiety Euclidean distance Namino•••Owater between the amino nitrogen and oxygen from bulk water.(b)42rows represent snapshot structures composed of 30, 60, 129, 171 and 297 variables (internal coordinates in columns).

Table 3 .
Comparison of the variances(a)in the Principal Component Analysis (PCA) using different matrix sizes Croat.Chem.Acta 87 (2014) 397.