The estimation problem of stochastic nonlinear parametric models is recognized to be very challenging due to the intractability of the likelihood function. Recently, several methods have been developed to approximate the maximum likelihood estimator and the optimal mean-square error predictor using Monte Carlo methods. Albeit asymptotically optimal, these methods come with several computational challenges and fundamental limitations.
The contributions of this thesis can be divided into two main parts. In the first part, approximate solutions to the maximum likelihood problem are explored. Both analytical and numerical approaches, based on the expectation-maximization algorithm and the quasi-Newton algorithm, are considered. While analytic approximations are difficult to analyze, asymptotic guarantees can be established for methods based on Monte Carlo approximations. Yet, Monte Carlo methods come with their own computational difficulties; sampling in high-dimensional spaces requires an efficient proposal distribution to reduce the number of required samples to a reasonable value.
In the second part, relatively simple prediction error method estimators are proposed. They are based on non-stationary one-step ahead predictors which are linear in the observed outputs, but are nonlinear in the (assumed known) input. These predictors rely only on the first two moments of the model and the computation of the likelihood function is not required. Consequently, the resulting estimators are defined via analytically tractable objective functions in several relevant cases. It is shown that, under mild assumptions, the estimators are consistent and asymptotically normal. In cases where the first two moments are analytically intractable due to the complexity of the model, it is possible to resort to vanilla Monte Carlo approximations. Several numerical examples demonstrate a good performance of the suggested estimators in several cases that are usually considered challenging.
Based on (448.1±2.9)×10^{6} ψ(3686) events collected with the BESIII detector operating at the BEPCII collider, the decay ψ(3686)→ϕK^0_S K^0_S is observed for the first time. Taking the interference between ψ(3686) decay and continuum production into account, the branching fraction of this decay is measured to be B(ψ(3686)→ϕK^0_S K^0_S)=(3.53±0.20±0.21)×10^{-5}, where the first uncertainty is statistical and the second is systematic. Combining with the world average value for B(J/ψ→ϕK^0_S K^0_S), the ratio B(ψ(3686)→ϕK^0_S K^0_S)/B(J/ψ→ϕK^0_S K^0_S) is determined to be (6.0±1.6)%, which is suppressed relative to the 12% rule.
This paper compares deterministic and stochastic models. Stochastic modelling is a more realistic way to study the dynamics of gonorrhoea infection than the corresponding deterministic model. Also, the deterministic solution is itself the mean of the stochastic solution of the model. For the numerical analysis, we first developed some explicit stochastic methods, but unfortunately they do not remain consistent in certain situations. We then proposed an implicitly driven explicit method for the stochastic heavy alcohol epidemic model. The proposed method is independent of the choice of parameters and behaves well in all scenarios. Theorems and simulations are presented in support of these findings.
Hypothesis testing has long been a formal and standardized process. Hypothesis generation, on the other hand, remains largely informal. This thesis assesses whether eXplainable AI (XAI) can aid in the standardization of hypothesis generation through its utilization as a hypothesis-generating tool for medical research. We produce XAI heat maps for a Convolutional Neural Network (CNN) trained to classify Microsatellite Instability (MSI) in colon and gastric cancer with four different XAI methods: Guided Backpropagation, VarGrad, Grad-CAM and Sobol Attribution. We then compare these heat maps with pathology annotations in order to look for differences to turn into new hypotheses. Our CNN successfully generates non-random XAI heat maps whilst achieving a validation accuracy of 85% and a validation AUC of 93%, as compared to others who achieve an AUC of 87%. Our results indicate that Guided Backpropagation and VarGrad are better at explaining high-level image features whereas Grad-CAM and Sobol Attribution are better at explaining low-level ones. This makes the two groups of XAI methods good complements to each other. Images of Microsatellite Instability (MSI) with high differentiation are more difficult to analyse regardless of which XAI method is used, probably because they exhibit less regularity. Regardless of this drawback, our assessment is that XAI can be used as a useful hypothesis-generating tool for research in medicine. Our results indicate that our CNN utilizes the same features as our basic pathology annotations when classifying MSI, with some additional features of basic pathology missing, features from which we are able to generate new hypotheses.
GDP is used to measure the economic state of a country, and accurate forecasts of it are therefore important. Using the Economic Tendency Survey, we investigate forecasting quarterly GDP growth using the data mining technique Random Forest. Comparisons are made with a benchmark AR(1) model and an ad hoc linear model built on the most important variables suggested by the Random Forest. Evaluation by forecasting shows that the Random Forest makes the most accurate forecast, supporting the theory that there are benefits to using Random Forests on economic time series.
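As a rough illustration of the AR(1) benchmark mentioned in the abstract above, the following sketch fits y_t = c + phi * y_{t-1} by ordinary least squares and produces a one-step-ahead forecast. The function names and the toy series are hypothetical; the thesis's actual data (the Economic Tendency Survey) and estimation details are not reproduced here.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """Least-squares fit of y_t = c + phi * y_{t-1} + e_t; returns (c, phi)."""
    x = series[:-1]  # lagged values y_{t-1}
    y = series[1:]   # current values y_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_next(series: List[float], c: float, phi: float) -> float:
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]


# Toy series (an assumption for illustration) generated exactly by
# y_t = 1.0 + 0.5 * y_{t-1}, so the fit should recover c = 1.0, phi = 0.5.
growth = [0.0]
for _ in range(8):
    growth.append(1.0 + 0.5 * growth[-1])

c_hat, phi_hat = fit_ar1(growth)
next_value = forecast_next(growth, c_hat, phi_hat)
```

A benchmark of this kind is useful mainly as a floor: a more flexible method is only interesting if it forecasts better than this two-parameter model on held-out quarters.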
We study the growth of two competing infection types on graphs generated by the configuration model with a given degree sequence. Starting from two vertices chosen uniformly at random, the infection types spread via the edges in the graph in that an uninfected vertex becomes type 1 (2) infected at rate λ_1 (λ_2) times the number of nearest neighbors of type 1 (2). Assuming (essentially) that the degree of a randomly chosen vertex has finite second moment, we show that if λ_1 = λ_2, then the fraction of vertices that are ultimately infected by type 1 converges to a continuous random variable V ∈ (0,1), as the number of vertices tends to infinity. Both infection types hence occupy a positive (random) fraction of the vertices. If λ_1 ≠ λ_2, on the other hand, then the type with the larger intensity occupies all but a vanishing fraction of the vertices. Our results apply also to a uniformly chosen simple graph with the given degree sequence.
We study a model of competition between two types evolving as branching random walks on Z^d. The two types are represented by red and blue balls, respectively, with the rule that balls of different colour annihilate upon contact. We consider initial configurations in which the sites of Z^d contain one ball each, independently coloured red with probability p and blue otherwise. We address the question of fixation, that is, whether or not the sites eventually settle for a given colour. Under a mild moment condition on the branching rule, we prove that the process will fixate almost surely for p ≠ 1/2 and that every site will change colour infinitely often almost surely for the balanced initial condition p = 1/2.
We study survival among two competing types in two settings: a planar growth model related to two-neighbor bootstrap percolation, and a system of urns with graph-based interactions. In the planar growth model, uncolored sites are given a color at rate 0, 1 or infinity, depending on whether they have zero, one, or at least two neighbors of that color. In the urn scheme, each vertex of a graph G has an associated urn containing some number of either blue or red balls (but not both). At each time step, a ball is chosen uniformly at random from all those currently present in the system, a ball of the same color is added to each neighboring urn, and balls in the same urn but of different colors annihilate on a one-for-one basis. We show that, for every connected graph G and every initial configuration, only one color survives almost surely. As a corollary, we deduce that in the two-type growth model on Z^2, one of the colors only infects a finite number of sites with probability one. We also discuss generalizations to higher dimensions and multi-type processes, and list a number of open problems and conjectures.
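The urn dynamics described in the abstract above can be sketched in a few lines. This is a minimal simulation under stated assumptions, not the authors' code: each urn is stored as a signed integer (positive for red, negative for blue), so adding a ball of the opposite colour automatically performs the one-for-one annihilation. The names `urn_step` and `colours_present` and the example graph are hypothetical.

```python
import random
from typing import Dict, List


def urn_step(urns: List[int], neighbors: Dict[int, List[int]],
             rng: random.Random) -> None:
    """Perform one step of the urn scheme in place.

    urns[v] > 0 means that many red balls at vertex v, urns[v] < 0 that
    many blue balls. At least one urn must be non-empty.
    """
    # Choosing a ball uniformly among all balls is the same as choosing
    # a vertex with probability proportional to its number of balls.
    weights = [abs(k) for k in urns]
    v = rng.choices(range(len(urns)), weights=weights, k=1)[0]
    colour = 1 if urns[v] > 0 else -1
    # Add a ball of the chosen colour to each neighbouring urn; signed
    # addition cancels it against a ball of the other colour if present.
    for w in neighbors[v]:
        urns[w] += colour


def colours_present(urns: List[int]) -> set:
    """Return the set of colours ('red', 'blue') that still have balls."""
    present = set()
    if any(k > 0 for k in urns):
        present.add("red")
    if any(k < 0 for k in urns):
        present.add("blue")
    return present


# Example on a triangle (an arbitrary small connected graph).
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
state = [3, -1, 0]
rng = random.Random(0)
for _ in range(200):
    urn_step(state, triangle, rng)
```

The chosen ball stays in its urn, so the system never becomes empty and the step is always well defined.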
Consider a monotone Boolean function f : {0,1}^n → {0,1} and the canonical monotone coupling {η_p : p ∈ [0,1]} of an element in {0,1}^n chosen according to product measure with intensity p ∈ [0,1]. The random point p ∈ [0,1] where f(η_p) flips from 0 to 1 is often concentrated near a particular point, thus exhibiting a threshold phenomenon. For a sequence of such Boolean functions, we peer closely into this threshold window and consider, for large n, the limiting distribution (properly normalized to be nondegenerate) of this random point where the Boolean function switches from being 0 to 1. We determine this distribution for a number of the Boolean functions which are typically studied and pay particular attention to the functions corresponding to iterated majority and percolation crossings. It turns out that these limiting distributions have quite varying behavior. In fact, we show that any nondegenerate probability measure on R arises in this way for some sequence of Boolean functions.
We study the phase transition of random radii Poisson Boolean percolation: around each point of a planar Poisson point process, we draw a disc of random radius, independently for each point. The behavior of this process is well understood when the radii are uniformly bounded from above. In this article, we investigate this process for unbounded (and possibly heavy tailed) radii distributions. Under mild assumptions on the radius distribution, we show that both the vacant and occupied sets undergo a phase transition at the same critical parameter λ_c. Moreover, for λ < λ_c, the vacant set has a unique unbounded connected component and we give precise bounds on the one-arm probability for the occupied set, depending on the radius distribution. At criticality, we establish the box-crossing property, implying that no unbounded component can be found, neither in the occupied nor the vacant sets. We provide a polynomial decay for the probability of the one-arm events, under sharp conditions on the distribution of the radius. For λ > λ_c, the occupied set has a unique unbounded component and we prove that the one-arm probability for the vacant set decays exponentially fast. The techniques we develop in this article can be applied to other models such as the Poisson Voronoi and confetti percolation.
We study a variant of Gilbert's disc model, in which discs are positioned at the points of a Poisson process in R^2 with radii determined by an underlying stationary and ergodic random field φ: R^2 → [0, ∞), independent of the Poisson process. This setting, in which the random field is independent of the point process, is often referred to as geostatistical marking. We examine how typical properties of interest in stochastic geometry and percolation theory, such as coverage probabilities and the existence of long-range connections, differ between Gilbert's model with radii given by some random field and Gilbert's model with radii assigned independently, but with the same marginal distribution. Among our main observations we find that complete coverage of R^2 does not necessarily happen simultaneously, and that the spatial dependence induced by the random field may both increase as well as decrease the critical threshold for percolation.
Unsupervised word embedding methods are frequently used for natural language processing applications. However, the unsupervised methods overlook known lexical relations that can be of value to capture accurate semantic word relations. This thesis aims to explore if Swedish word embeddings can benefit from prior known linguistic information. Four knowledge graphs extracted from Svenska Akademiens ordlista (SAOL) are incorporated during the training process using the Probabilistic Word Embeddings with Laplacian Priors (PELP) model. The four implemented PELP models are compared with baseline results to evaluate the use of side information. The results suggest that various lexical relations in SAOL are of interest to generate more accurate Swedish word embeddings.
A test statistic for homogeneity of two or more covariance matrices is presented when the distributions may be non-normal and the dimension may exceed the sample size. Using the Frobenius norm of the difference of null and alternative hypotheses, the statistic is constructed as a linear combination of consistent, location-invariant, estimators of trace functions that constitute the norm. These estimators are defined as U-statistics and the corresponding theory is exploited to derive the normal limit of the statistic under a few mild assumptions as both sample size and dimension grow large. Simulations are used to assess the accuracy of the statistic.
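The phrase "trace functions that constitute the norm" in the abstract above can be made concrete in the two-sample case. The identity below is standard for symmetric matrices; the notation Σ_1, Σ_2 for the two covariance matrices is assumed here, and the statistic's actual estimators and normalisation are not reproduced.

```latex
\[
\|\Sigma_1 - \Sigma_2\|_F^2
  = \operatorname{tr}\!\left[(\Sigma_1 - \Sigma_2)^2\right]
  = \operatorname{tr}(\Sigma_1^2) + \operatorname{tr}(\Sigma_2^2)
    - 2\,\operatorname{tr}(\Sigma_1 \Sigma_2).
\]
```

The norm vanishes exactly when Σ_1 = Σ_2, so replacing each of the three traces by a consistent estimator gives a linear combination whose size measures departure from homogeneity.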
The RV coefficient is an important measure of linear dependence between two multivariate data vectors. Using unbiased and computationally efficient estimators of its components, a modification to the RV coefficient is proposed, and used to construct a test of significance for the true coefficient. The modified estimator improves the accuracy of the original and, along with the test, can be applied to data with arbitrarily large dimensions, possibly exceeding the sample size, and the underlying distribution need only have finite fourth moment. Exact and asymptotic properties are studied under fairly general conditions. The accuracy of the modified estimator and the test is shown through simulations under a variety of parameter settings. In comparisons against several existing methods, both the proposed estimator and the test exhibit similar performance to the distance correlation. Several real data applications are also provided.
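For orientation, the classical (unmodified) RV coefficient referred to in the abstract above can be computed as follows. This sketch uses the standard definition tr(S_xy S_yx) / sqrt(tr(S_xx^2) tr(S_yy^2)), evaluated through centred Gram matrices, which gives the same value by the cyclic property of the trace. It is not the modified estimator proposed in the paper, and the function names and data are hypothetical.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]


def centred_gram(rows: Matrix) -> Matrix:
    """Return the n x n Gram matrix of the column-centred data rows."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(a * b for a, b in zip(u, v)) for v in centred]
            for u in centred]


def frobenius_inner(a: Matrix, b: Matrix) -> float:
    """Sum of elementwise products, i.e. tr(a b) for symmetric a and b."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def rv_coefficient(x: Matrix, y: Matrix) -> float:
    """Classical RV coefficient between two data sets on the same n units."""
    gx = centred_gram(x)
    gy = centred_gram(y)
    return frobenius_inner(gx, gy) / sqrt(
        frobenius_inner(gx, gx) * frobenius_inner(gy, gy))


# Hypothetical data: four observations of a 2-vector and a 1-vector.
x_data = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
y_data = [[1.0], [0.0], [2.0], [5.0]]
rv_xy = rv_coefficient(x_data, y_data)
```

Working with the n x n Gram matrices rather than the p x p covariance matrices keeps the cost manageable when the dimension greatly exceeds the sample size.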
A unified testing framework is presented for large-dimensional mean vectors of one or several populations which may be non-normal with unequal covariance matrices. Beginning with one-sample case, the construction of tests, underlying assumptions and asymptotic theory, is systematically extended to multi-sample case. Tests are defined in terms of U-statistics-based consistent estimators, and their limits are derived under a few mild assumptions. Accuracy of the tests is shown through simulations. Real data applications, including a five-sample unbalanced MANOVA analysis on count data, are also given.
Tests of zero correlation between two or more vectors with large dimension, possibly larger than the sample size, are considered when the data may not necessarily follow a normal distribution. A single sample case for several vectors is first proposed, which is then extended to the common covariance matrix under the assumption of homogeneity across several independent populations. The test statistics are constructed using a recently proposed modification of the RV coefficient for high-dimensional vectors. The accuracy of the tests is shown through simulations.
Test statistics for homogeneity, sphericity and identity of high-dimensional covariance matrices are presented under a wide variety of very general conditions when the dimension of the vector, $p$, may exceed the sample size, $n_i$, $i = 1, \ldots, g$. First, location-invariant tests are presented under normality assumption, followed by their robustness to normality by replacing the normality assumption with a mild alternative multivariate model. The two types of tests are then presented in non-invariant form, again under normality and non-normality. Tests of homogeneity of covariance matrices in all cases are immediately supplemented by the tests for sphericity and identity of the common covariance matrix under the null hypothesis. Both location-invariant and non-invariant tests are composed of estimators that are defined as $U$-statistics with kernels of different degrees. Hence, the asymptotic theory of $U$-statistics is employed to arrive at the limiting null and alternative distributions of tests for all cases. These limit distributions are derived using a very mild and practically viable set of assumptions mainly on the traces of the unknown covariance matrices. Finally, corrections and improvements of a few other tests are also presented.
For two or more multivariate distributions with common covariance matrix, test statistics for certain special structures of the common covariance matrix are presented when the dimension of the multivariate vectors may exceed the number of such vectors. The test statistics are constructed as functions of location-invariant estimators defined as U-statistics, and the corresponding asymptotic theory is used to derive the limiting distributions of the proposed tests. The properties of the test statistics are established under mild and practical assumptions, and the same are numerically demonstrated using simulation results with small or moderate sample sizes and large dimensions.
A test statistic for homogeneity of two or more covariance matrices of large dimensions is presented when the data are multivariate normal. The statistic is location-invariant and defined as a function of U-statistics of non-degenerate kernels so that the corresponding asymptotic theory is employed to derive the limiting normal distribution of the test under a few mild and practical assumptions. Accuracy of the test is shown through simulations with different parameter settings.
Multiple comparisons for two or more mean vectors are considered when the dimension of the vectors may exceed the sample size, the design may be unbalanced, populations need not be normal, and the true covariance matrices may be unequal. Pairwise comparisons, including comparisons with a control, and their linear combinations are considered. Under fairly general conditions, the asymptotic multivariate distribution of the vector of test statistics is derived whose quantiles can be used in multiple testing. Simulations are used to show the accuracy of the tests. Real data applications are also demonstrated.
Tests for certain covariance structures, including sphericity, are presented when the data may be high-dimensional but not necessarily normal. The tests are formulated as functions of location-invariant estimators defined as U-statistics of higher order kernels. Under a few mild assumptions, the limit distributions of the tests are shown to be normal. The accuracy of the tests is demonstrated by simulations.
A test for homogeneity of g ≥ 2 covariance matrices is presented when the dimension, p, may exceed the sample size, n_i, i = 1, ..., g, and the populations may not be normal. Under some mild assumptions on covariance matrices, the asymptotic distribution of the test is shown to be normal when n_i, p → ∞. Under the null hypothesis, the test is extended for common covariance matrix to be of a specified structure, including sphericity. Theory of U-statistics is employed in constructing the tests and deriving their limits. Simulations are used to show the accuracy of tests.
Consider a random sample of n iid vectors, each of dimension p and partitioned into b sub-vectors of sizes p_i, i = 1, ..., b. Location-invariant and non-invariant test statistics for independence of sub-vectors are presented when p_i may exceed n and the distribution need not be normal. The tests are composed of U-statistics based estimators of the Frobenius norm of the difference between the null and alternative hypotheses. Asymptotic distributions of the tests are provided for n, p_i → ∞, and their finite-sample performance is demonstrated through simulations. Some related and subsequent tests are briefly described. Relations of the proposed tests to certain multivariate measures are discussed, which are of interest on their own.
A test for proportionality of two covariance matrices with large dimension, possibly larger than the sample size, is proposed. The test statistic is simple, computationally efficient, and can be used for a large class of multivariate distributions including normality. The properties of the statistic, including asymptotic distribution, are given under high-dimensional set up. Through simulations, the statistic is shown to perform accurately, and outperform its recent competitors, constructed on the basis of similar principles. An extension to the multi-sample case is given.
Tests of zero correlation between two or more vectors with large dimension, possibly larger than the sample size, are considered when the data may not necessarily follow a normal distribution. A single-sample case for several vectors is first proposed, which is then extended to the common covariance matrix under the assumption of homogeneity across several independent populations. The test statistics are constructed using a recently proposed modification of the RV coefficient (a correlation coefficient for vector-valued random variables) for high-dimensional vectors. The accuracy of the tests is shown through simulations.
Test statistics are presented for general linear hypotheses, with special focus on the two-sample profile analysis. The statistics are a modification to the classical Hotelling’s T^{2} statistic, are basically designed for the case when the dimension, p, may exceed the sample sizes, n_{i}, and are valid under the violation of any assumption associated with T^{2}, such as normality, homoscedasticity, or equal sample sizes. Under a few mild assumptions replacing the classical ones, the test statistics are shown to follow a normal limit under both the null and alternative hypothesis. As the test statistics are defined as a linear combination of U-statistics, the limits are correspondingly obtained using the asymptotic theory of degenerate (for null) and nondegenerate (for alternative) U-statistics. Simulation results, under a variety of parameter settings, are used to show the accuracy and robustness of the test statistics. Practical application of the tests is also illustrated using a few real data sets.
A modification to the asymptotic distribution of the T^2-statistic used in multivariate process monitoring is provided when the dimension of the vectors may exceed the sample size. Under certain mild conditions, a unified limit distribution is obtained that is applicable for both Phase I and II charts. Further, the limit holds for charts based on individual observations as well as subgroup means. The limit is easily applicable and does not need any data preprocessing or dimension reduction. Simulations are used to demonstrate the accuracy of the proposed limit.
For a random sample of n iid p-dimensional vectors, each partitioned into b sub-vectors of dimensions p_{i}, i=1,…,b, tests for zero correlation of sub-vectors are presented when p_{i} ≫ n and the distribution need not be normal. The test statistics are composed of U-statistics based estimators of the Frobenius norm measuring the distance between the null and alternative hypotheses. Asymptotic distributions of the tests are provided for n, p_{i} → ∞, with their finite-sample performance demonstrated through simulations. Some related tests are discussed. A real data application is also given.
A classifier for two or more samples is proposed when the data are high-dimensional and the distributions may be non-normal. The classifier is constructed as a linear combination of two easily computable and interpretable components, the U-component and the P-component. The U-component is a linear combination of U-statistics of bilinear forms of pairwise distinct vectors from independent samples. The P-component, the discriminant score, is a function of the projection of the U-component on the observation to be classified. Together, the two components constitute an inherently bias-adjusted classifier valid for high-dimensional data. The classifier is linear but its linearity does not rest on the assumption of homoscedasticity. Properties of the classifier and its normal limit are given under mild conditions. Misclassification errors and asymptotic properties of their empirical counterparts are discussed. Simulation results are used to show the accuracy of the proposed classifier for small or moderate sample sizes and large dimensions. Applications involving real data sets are also included.
Test statistics for sphericity and identity of the covariance matrix are presented, when the data are multivariate normal and the dimension, p, can exceed the sample size, n. Under certain mild conditions mainly on the traces of the unknown covariance matrix, and using the asymptotic theory of U-statistics, the test statistics are shown to follow an approximate normal distribution for large p, also when p >> n. The accuracy of the statistics is shown through simulation results, particularly emphasizing the case when p can be much larger than n. A real data set is used to illustrate the application of the proposed test statistics.
Ahmad et al. (in press) presented test statistics for sphericity and identity of the covariance matrix of a multivariate normal distribution when the dimension, p, exceeds the sample size, n. In this note, we show that their statistics are robust to the normality assumption, when normality is replaced with certain mild assumptions on the traces of the covariance matrix. Under such assumptions, the test statistics are shown to follow the same asymptotic normal distribution as under normality for large p, also when p >> n. The asymptotic normality is proved using the theory of U-statistics, and is based on very general conditions, particularly avoiding any relationship between n and p.
This thesis investigates the relative merits and drawbacks of six portfolio weighting techniques, including two traditional (equal and value-weighting), two optimization-based (mean-variance and risk parity), and two statistical (principal component analysis and ridge regression) techniques. The focus is thus on selecting the weighting technique, an aspect of portfolio construction that market practitioners sometimes overlook. The analysis, implemented on the Swedish market, employs historical backtesting, Monte Carlo simulations, and stress tests to evaluate various techniques under diverse market conditions. The results reveal that no single portfolio consistently outperforms or underperforms across all metrics and scenarios, highlighting the importance of a comprehensive set of performance and risk measures for informed investment decisions. Furthermore, the statistical techniques, principal component analysis and ridge regression demonstrate competitive risk-adjusted returns relative to the traditional and optimization-based techniques implemented in this thesis. The results suggest that market practitioners should consider incorporating these somewhat uncommon techniques alongside more traditional techniques in portfolio management, depending on their investment objectives and risk tolerance.
This thesis investigates whether a pairs trading strategy on Swedish stocks can generate a higher risk-adjusted return than a buy-and-hold strategy on a benchmark index. The benchmark index is the OMX Stockholm Benchmark index (OMXSBPI), which is intended to reflect the Swedish market in general. With a statistical focus, a trading algorithm is built and then evaluated on data from 2018 to 2021. The statistical concepts on which this thesis is based are stationarity and cointegration, and the Augmented Dickey-Fuller test forms the basis for testing these concepts. The risk-adjusted return of the strategy is evaluated using the popular Sharpe ratio, which is then compared to the Sharpe ratio of the OMXSBPI index. The results obtained in this study cannot confirm that the pairs trading strategy is better than a buy-and-hold strategy on the OMXSBPI index in terms of risk-adjusted return. One indication, however, is that the strategy seems to perform better when the market is declining. In 2018, the index went down by 7.7060 percent while the strategy went up by 7.5100 percent. As this is data for only one year, it is not possible to determine whether it is due to chance or a potential edge of the strategy.
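The Sharpe ratio used for the comparison in the abstract above is the mean excess return divided by its standard deviation, usually annualised. A minimal sketch follows; the zero risk-free rate, the 252 trading periods per year, and the sample returns are assumptions made for illustration and are not taken from the thesis.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List


def sharpe_ratio(returns: List[float], risk_free: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualised Sharpe ratio of a series of per-period returns.

    risk_free is the per-period risk-free return (assumed 0 by default);
    periods_per_year = 252 assumes daily data on trading days.
    """
    excess = [r - risk_free for r in returns]
    # Sample standard deviation (n - 1 in the denominator).
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)


# Hypothetical daily returns for a strategy and for a benchmark index.
strategy_returns = [0.004, -0.002, 0.003, 0.001, -0.001, 0.002]
index_returns = [0.006, -0.007, 0.005, -0.004, 0.003, 0.001]

strategy_sharpe = sharpe_ratio(strategy_returns)
index_sharpe = sharpe_ratio(index_returns)
```

Because the ratio scales returns by their volatility, a strategy with lower raw returns than the index can still come out ahead if its returns are steadier.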
In this short paper we prove a quantitative version of the Khintchine-Groshev Theorem with congruence conditions. Our argument relies on a classical argument of Schmidt on counting generic lattice points, which in turn relies on a certain variance bound on the space of lattices.
We investigate the number of permutations that occur in random labellings of trees. This is a generalisation of the number of subpermutations occurring in a random permutation. It also generalises some recent results on the number of inversions in randomly labelled trees (Cai et al. in Combin Probab Comput 28(3):335-364, 2019). We consider complete binary trees as well as random split trees, a large class of random trees of logarithmic height introduced by Devroye (SIAM J Comput 28(2):409-432, 1998. 10.1137/s0097539795283954). Split trees consist of nodes (bags) which can contain balls and are generated by a random trickle down process of balls through the nodes. For complete binary trees we show that asymptotically the cumulants of the number of occurrences of a fixed permutation in the random node labelling have explicit formulas. Our other main theorem shows that for a random split tree, with probability tending to one as the number of balls increases, the cumulants of the number of occurrences are asymptotically an explicit parameter of the split tree. For the proof of the second theorem we show some results on the number of embeddings of digraphs into split trees which may be of independent interest.
The use of statistical classification techniques in classifying loan applications into good loans and bad loans gained importance with the exponential increase in the demand for credit. It is paramount to use a classification technique with a high predictive capacity to ensure the profitability of the business venture.
In this study we aim to compare the predictive capability of three classification techniques: 1) logistic regression, 2) CART, and 3) random forests. We apply these techniques to German credit data using an 80:20 learning:test split, and compare the performance of the models fitted using the three classification techniques. The probability of default p_{i} for each observation in the test set is calculated using the models fitted on the training dataset. Each test set sample x_{i} is then classified as a good loan or a bad loan based on a threshold: x_{i} is assigned to the bad loan class if p_{i} exceeds the threshold. We chose several thresholds in order to compare the performance of each of the three classification techniques on five model suitability statistics: accuracy, precision, negative predictive value, recall, and specificity.
None of the classifiers turned out to be best at all the five cross-validation statistics. However, logistic regression has the best performance at low probability of default thresholds. On the other hand, for higher thresholds, CART performs best in accuracy, precision, and specificity measures, while random forest performs best for negative predictive value and recall measures.
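The thresholding step and the five statistics named in the abstract above can be sketched as follows. The sketch assumes that "bad loan" is treated as the positive class, which the abstract does not state; the function name and the toy probabilities are hypothetical.

```python
from typing import Dict, List


def threshold_metrics(probs: List[float], is_bad: List[bool],
                      threshold: float) -> Dict[str, float]:
    """Classify as 'bad' when p > threshold and compute five statistics.

    'Bad loan' is taken as the positive class (an assumption).
    A statistic with a zero denominator is returned as NaN.
    """
    pred_bad = [p > threshold for p in probs]
    tp = sum(1 for p, a in zip(pred_bad, is_bad) if p and a)
    fp = sum(1 for p, a in zip(pred_bad, is_bad) if p and not a)
    fn = sum(1 for p, a in zip(pred_bad, is_bad) if not p and a)
    tn = sum(1 for p, a in zip(pred_bad, is_bad) if not p and not a)

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": ratio(tp, tp + fp),
        "negative_predictive_value": ratio(tn, tn + fn),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy test set: predicted default probabilities and true labels.
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.2]
toy_is_bad = [True, False, True, False, False]
metrics_at_half = threshold_metrics(toy_probs, toy_is_bad, 0.5)
```

Calling the function over a grid of thresholds for each fitted model gives the kind of per-threshold comparison the study describes.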
Given a Poisson process in two or three dimensions we are interested in the scan statistic, i.e. the largest number of points contained in a translate of a fixed scanning set restricted to lie inside a rectangular area.
The distribution of the scan statistic is accurately approximated for rectangular scanning sets, using a technique that is also extended to higher dimensions.
The accuracy of the approximation is checked through simulation.
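To make the definition of the scan statistic concrete, here is a brute-force evaluation for a rectangular scanning window in two dimensions, of the kind one might use inside a simulation check. It is not the approximation technique of the paper. It relies on the observation that an optimal window can be slid until its left and bottom edges touch a point or the window reaches the boundary of the region; the function name and the example points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def scan_statistic(points: List[Point], region_w: float, region_h: float,
                   win_w: float, win_h: float) -> int:
    """Largest number of points in a win_w x win_h window inside the region.

    The region is [0, region_w] x [0, region_h]; the window is closed and
    must lie entirely inside the region.
    """
    # Candidate lower-left corners: each point's coordinate, clamped so
    # that the window does not stick out of the region.
    xs = {min(x, region_w - win_w) for x, _ in points}
    ys = {min(y, region_h - win_h) for _, y in points}
    best = 0
    for x0 in xs:
        for y0 in ys:
            count = sum(1 for x, y in points
                        if x0 <= x <= x0 + win_w and y0 <= y <= y0 + win_h)
            best = max(best, count)
    return best


# Four points in the unit square; three of them fit in one 0.2 x 0.2 window.
sample_points = [(0.10, 0.10), (0.15, 0.12), (0.28, 0.11), (0.90, 0.90)]
largest_cluster = scan_statistic(sample_points, 1.0, 1.0, 0.2, 0.2)
```

The cost grows cubically in the number of points, which is acceptable for checking an approximation by simulation but not for large point sets.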
For $n\ge0$, let $\lambda_n$ be the median of the $\Gamma(n+1,1)$ distribution. We prove that the sequence $\{\alpha_n=\lambda_n-n\}$ decreases from $\log 2$ to $2/3$ as $n$ increases from 0 to $\infty$. The difference, $1-\alpha_n$, between the mean and the median thus increases from $1-\log 2$ to $1/3$.
This result also proves the following conjecture by Chen \& Rubin about the Poisson distributions: Let $Y_{\mu}\sim\text{Poisson}(\mu)$, and $\lambda_n$ be the largest $\mu$ such that $P(Y_{\mu}\le n)=1/2$, then $\lambda_n-n$ is decreasing in $n$.
The sequence $\{\alpha_n\}$ is related to a sequence $\{\theta_n\}$, introduced by Ramanujan, which is known to be decreasing and of the form
$\theta_n=\frac13+\frac4{135(n+k_n)}$, where $\frac2{21}<k_n\le\frac8{45}$. We also show that the sequence $\{k_n\}$ is decreasing.
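The monotonicity of $\{\alpha_n\}$ stated above can be checked numerically for small $n$. The sketch below is an illustration only, not part of the proof: it uses the standard identity $P(\Gamma(n+1,1)\le x)=1-e^{-x}\sum_{k=0}^{n}x^k/k!$ and finds the median by bisection on $[n,n+1]$. The function names and the range of $n$ are arbitrary choices.

```python
from math import exp, log
from typing import List


def gamma_cdf(n: int, x: float) -> float:
    """P(Gamma(n+1, 1) <= x) = 1 - exp(-x) * sum_{k=0}^{n} x^k / k!."""
    term = exp(-x)  # k = 0 term of the Poisson(x) probabilities
    total = term
    for k in range(1, n + 1):
        term *= x / k
        total += term
    return 1.0 - total


def gamma_median(n: int, tol: float = 1e-12) -> float:
    """Median lambda_n of Gamma(n+1, 1), by bisection on [n, n+1]."""
    lo, hi = float(n), float(n + 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_cdf(n, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# alpha_n = lambda_n - n for n = 0, ..., 30.
alphas: List[float] = [gamma_median(n) - n for n in range(31)]
starts_at_log2 = abs(alphas[0] - log(2)) < 1e-9
is_decreasing = all(a > b for a, b in zip(alphas, alphas[1:]))
```

The same cumulative sum is $P(Y_\mu\le n)$ for $Y_\mu\sim\text{Poisson}(\mu)$ evaluated at $\mu=x$, which is the link to the Chen \& Rubin conjecture.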
The usual definition of average degree for a non-regular lattice has the disadvantage that it takes the same value for many lattices with clearly different connectivity. We introduce an alternative definition of average degree, which better separates different lattices.
These measures are compared on a class of lattices and are analyzed using a Markov chain describing a random walk on the lattice. Using the new measure, we conjecture the order of both the critical probabilities for bond percolation and the connective constants for self-avoiding walks on these lattices.
First passage percolation on Z^2 is a model for describing the spread of an infection on the sites of the square lattice. The infection is spread via nearest neighbor sites and the time dynamic is specified by random passage times attached to the edges. In this paper, the speed of the growth and the shape of the infected set are studied with the aid of large-scale computer simulations, with focus on continuous passage time distributions. It is found that the most important quantity for determining the value of the time constant, which indicates the inverse asymptotic speed of the growth, is , where are i.i.d. passage time variables. The relation is linear for a large class of passage time distributions. Furthermore, the directional time constants are seen to be increasing when moving from the axis towards the diagonal, so that the limiting shape is contained in a circle with radius defined by the speed along the axes. The shape comes closer to the circle for distributions with larger variability.
We study the random graph G(n,p) with a random orientation. For three fixed vertices s, a, b in G(n,p) we study the correlation of the events {a → s} (there exists a directed path from a to s) and {s → b}. We prove that asymptotically the correlation is negative for small p, p < C_1/n, where C_1 ≈ 0.3617, positive for C_1/n < p < 2/n and up to p = p_2(n). Computer aided computations suggest that p_2(n) = C_2/n, with C_2 ≈ 7.5. We conjecture that the correlation then stays negative for p up to the previously known zero at 1/2; for larger p it is positive.