metadatta.

Managing Information on the Web, Wolfram-Style

May 15, 2009 · Leave a Comment

While Stephen Wolfram may not be the most, um, orthodox figure in the scientific community (see, for example, Steven Levy’s bio, or Cosma Shalizi’s review of the modestly-titled A New Kind of Science), I don’t think anyone doubts the usefulness of Mathematica and the various things associated with it (e.g. MathWorld and the Demonstrations Project). And now apparently his latest production – WolframAlpha, Wolfram’s new Mathematica-based search engine – will be released to the public this Monday. It looks quite interesting.

Finding useful information on the internet can be difficult and incredibly annoying, particularly for scientists or anyone in search of statistics of some sort. Google and Wikipedia, while useful, can often be inefficient or yield inadequate results. Many new search engines tailored to various interests seem to have emerged recently, but I am not aware of any current tools that satisfactorily tackle this particular (non-trivial) problem. One solution for anyone interested in biology is bionumbers, a searchable database of useful biological facts and data taken straight from the literature — but I think it’s quite clear that a more general and comprehensive solution (which WolframAlpha purports to be) would be very cool.

Judging from Wolfram’s promo video and reviews on pcworld, techreview and semantic universe, Alpha seems to be bionumbers made significantly more powerful and comprehensive. You probably won’t want to use it over Google to find movie times or track your favorite celebrities’ love lives; but you will want to use it to find various kinds of quantitative information: various metrics of the weather in Springfield, MA on the day David Ortiz was born, the location and sequence of some gene, the flowfield over a particular airfoil, the current position of the International Space Station, or data on blood cholesterol and potassium levels of middle-aged male smokers, for example. I look forward to pushing the limits of this tool, and it looks very useful.

Not to be outmatched, Google recently announced plans to implement a similar kind of service using publicly-available data. I’m not sure when they will be releasing it, though, or how it will compare to WolframAlpha.

→ Leave a Comment · Categories: Artificial Intelligence · Computing · General · Media · Technology · Websites

Determining Linguistic Structures Using Entropy?

May 2, 2009 · 2 Comments

Here’s an interesting recent controversy. While I know absolutely nothing about the subject, the ideas and questions raised are interesting, so here’s a quick summary of the different opinions.

In one corner – Rao et al.
In a high-profile Brevia published in Science a week ago, Rao et al. suggest that

  1. the degree of randomness in linguistic systems is significantly different from that of nonlinguistic systems, and
  2. the degree of randomness of the script of the Indus civilization is similar to that of other linguistic systems – in particular, Sumerian and Old Tamil. (The similarity to Old Tamil is particularly striking because it seems to support the somewhat controversial opinion of some “that the Indus peoples spoke and wrote a Dravidian language” – here I’m quoting from Farmer et al.’s rebuttal.)

Based on this, their claim is that the Indus script encodes some kind of linguistic structure, in stark contradiction to some well-known work by Farmer et al. arguing that the Indus script is “a simple nonlinguistic sign system common in the ancient world”. Unsurprisingly, this has set off a number of critical responses, and it’s always fun to see discussion and debate of this sort go on.

The way Rao et al. quantify the degree of randomness of any given sequence of units, or “tokens” (for example, words or characters in English) is by computing the conditional entropy, a standard measure of randomness in information theory. Simplistically, this quantity is a measure of how flexibly different tokens can be ordered: in a nonlinguistic system where different tokens are ordered at random – what Rao et al. call a “Type 1 nonlinguistic system” – the conditional entropy is high, while in a nonlinguistic system where a given token must be followed by another specific token – a “Type 2 nonlinguistic system” – the conditional entropy is low. Intuitively, it is perhaps not surprising that linguistic systems fall somewhere in between: Rao et al. verify this by computing the conditional entropy for a few different linguistic systems, as well as two synthetic nonlinguistic systems (type 1 and type 2). They use this to support their first claim. Furthermore, they compute the conditional entropy for sequences of signs from the Indus script and – surprise, surprise – find that it falls somewhere in between the type 1 and type 2 nonlinguistic systems, just like the other linguistic systems they studied. They use this to support their second claim.
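To make the two extremes concrete, here is a minimal sketch of a bigram-based conditional entropy estimate. It is not Rao et al.’s actual procedure: the estimator, the eight-letter alphabet, the sequence length and the “mixed” sequence are all assumptions chosen purely for illustration.

```python
import math
import random
from collections import Counter


def conditional_entropy(tokens):
    """Plug-in estimate of H(next token | current token), in bits, from bigram counts."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    first_counts = Counter(tokens[:-1])
    total_pairs = len(tokens) - 1
    h = 0.0
    for (a, _b), n_ab in pair_counts.items():
        p_ab = n_ab / total_pairs            # joint probability of the pair
        p_b_given_a = n_ab / first_counts[a]  # probability of b following a
        h -= p_ab * math.log2(p_b_given_a)
    return h


rng = random.Random(0)          # fixed seed so the run is repeatable
alphabet = list("ABCDEFGH")     # hypothetical 8-sign system
length = 20000

# "Type 1"-like: every token drawn independently and uniformly at random.
type1 = [rng.choice(alphabet) for _ in range(length)]

# "Type 2"-like: each token fully determines the next one (a fixed cycle).
type2 = [alphabet[i % len(alphabet)] for i in range(length)]

# Something in between: half the time follow the cycle, half the time pick at random.
mixed = [alphabet[0]]
for _ in range(length - 1):
    if rng.random() < 0.5:
        nxt = alphabet[(alphabet.index(mixed[-1]) + 1) % len(alphabet)]
    else:
        nxt = rng.choice(alphabet)
    mixed.append(nxt)

h1 = conditional_entropy(type1)
h2 = conditional_entropy(type2)
h_mixed = conditional_entropy(mixed)
print(h1, h_mixed, h2)
```

The random sequence comes out close to the maximum of log2(8) = 3 bits, the rigid cycle gives zero, and the mixed sequence lands in between. Note that the mixed sequence is not a language in any sense, which is essentially the objection the critics raise.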

In the other corner – Farmer et al., Liberman, Pereira, Shalizi, Sproat, and others.
Farmer et al. – whose work Rao et al.’s contradicts – have written a pretty strong response to Rao et al.’s paper. Among other things, Farmer et al. claim that their original work from 2004 “awakened resistance from Indian nationalists and researchers whose entire careers have been linked to the Indus-script thesis, one of whom is listed as coauthor of [Rao et al.'s] study”; and, “if [Rao et al.'s] paper had been properly peer reviewed it would not have been published.” Ouch. Here are their main critiques of this work:

  • Rao et al. used “synthetic” type 1 and type 2 nonlinguistic data in their calculations – that is, they created it according to certain rules. In a sense, these are designed to represent two different extremes on the “conditional entropy spectrum”, and as such it is not surprising that linguistic systems fall somewhere in between. Other nonlinguistic systems might, as well – so, claim #1 is unsubstantiated.
  • The idea that the Indus signs are in some linguistic way related to Old Tamil does not make sense historically: for example, “the first attestation of Old Tamil came nearly two thousand years after the Indus civilization disappeared”.

Others have weighed in on this as well, including Mark Liberman, Fernando Pereira, Cosma Shalizi, and Richard Sproat. In particular, Liberman, Shalizi and Sproat have come up with simple counter-examples to Rao et al.’s data: nonlinguistic datasets that show at least qualitatively similar behavior to Rao et al.’s linguistic datasets. At least for now, Pereira’s comment that language is “a system… carrying lots of specific information that cannot be captured by a single statistic” seems to hold.

→ 2 Comments · Categories: Academia · Artificial Intelligence · Interdisciplinary · Mathematics · Models · Papers · Physics · Science · Social Science · Statistics

Type-1.5 superconductors

February 22, 2009 · 2 Comments

Superconductors are generally classified as being type-I or type-II; Doug Natelson touched on this as part of his recent series of pedagogical posts explaining solid-state physics concepts. Type-I superconductors usually do not admit an external magnetic field in the superconducting state: they turn “normal” above a critical value of the field. Type-II superconductors do admit a magnetic field for some field strengths above the critical value, while still being able to superconduct: this is known as the “mixed” state.

In the presence of an externally-applied magnetic field, “vortices” form in a superconductor, with a normal nonsuperconducting core of size ~ \xi (the “coherence length” over which Cooper pairs – quasiparticles consisting of pairs of ‘bound’ electrons – are extended). This is surrounded by a region of size ~ \lambda in which a supercurrent circulates (where \lambda is known as the penetration depth), and hence produces its own opposing magnetic field. Forming such a vortex requires some energy; straightforward calculations show that the interfacial energy per unit length associated with a vortex is proportional to \xi^{2}-\lambda^{2} . If \xi > \lambda , forming such vortices increases the free energy of the system, and vortices tend to attract and annihilate – as in the case of type-I superconductors. If \xi < \lambda , on the other hand, a “lattice” of repulsive vortices is formed – as in the case of the mixed state of type-II superconductors.
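The sign argument above can be written as a tiny sketch. It uses only the simplified \xi^{2}-\lambda^{2} estimate from this post (the full Ginzburg-Landau treatment puts the crossover at \lambda/\xi = 1/\surd2 rather than at \xi = \lambda), and the function name and the length values are made up for illustration.

```python
def vortex_regime(xi, lam):
    """Classify by the sign of xi^2 - lambda^2 (the simplified estimate in the text)."""
    energy_sign = xi ** 2 - lam ** 2
    if energy_sign > 0:
        # Forming vortices costs free energy: vortices attract and annihilate.
        return "type-I-like"
    if energy_sign < 0:
        # Forming vortices is favorable: a lattice of repelling vortices forms.
        return "type-II-like"
    return "marginal"


# Hypothetical lengths in nanometres, not measured values for any material.
print(vortex_regime(xi=50.0, lam=30.0))
print(vortex_regime(xi=10.0, lam=100.0))
```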

In some materials, electrons can exist in two different bands (\pi and \sigma ), reflecting different kinds of bonding. A classic example of this is graphite. The electrons in the highest occupied states in a structurally similar superconductor, MgB2, are similarly \pi - or \sigma -bonded. This can be thought of as resulting in two different kinds of Cooper pairs with two different values of \xi and \lambda . The interesting thing is that in MgB2, the quasiparticles associated with the \pi electrons have \xi > \lambda (type-I), while the quasiparticles associated with the \sigma electrons have \xi < \lambda (type-II).

The coupling between these two different states is so weak that MgB2 was predicted – and has now been found – to be a so-called “type-1.5” superconductor: that is, one with behavior combining aspects of type-I and type-II superconductivity. In this case, the vortices repel each other (as in type-II superconductors) over short distances while they attract each other (as in type-I superconductors) over long distances. In a previous post, I noted that competition between long-range repulsive and short-range attractive forces often leads to spatially inhomogeneous and anisotropic phases in various systems: examples include “stripe” or “bubble” phases in block copolymers, “pasta phases” of the crusts of neutron stars or DNA-intercalated lipid bilayers, stripe formation in ferrofluids, and anisotropic phases in two-dimensional electron gases in the presence of moderately large magnetic fields. Similarly, one might expect the competition between short-range repulsive and long-range attractive forces between vortices to give rise to interesting pattern formation in MgB2 at low applied fields.

This is what Moshchalkov et al. set out to explore. One way to visualize the flux vortices of a superconductor is using so-called ‘magnetic decoration’: that is, by sprinkling ferromagnetic powder onto the surface of the superconductor. The powder is then attracted by the vortex flux lines and forms a pattern representative of the flux vortices. Using this technique, Moshchalkov et al. indeed found that the vortices in MgB2 were inhomogeneously distributed, often forming stripes separated by regions of ‘normal’ phase – thus confirming that MgB2 is a type-1.5 superconductor.

→ 2 Comments · Categories: Condensed Matter Physics · Magnetism · Papers · Physics · Science · Superconductivity

Metallic Glasses

December 21, 2008 · 1 Comment

Glasses have received a lot of attention because of their interesting structure and dynamics (indeed, Nobel Laureate Phil Anderson wrote that “The deepest and most interesting unsolved problem in solid state theory is probably the theory of the nature of glass and the glass transition”). Unlike crystals, they do not possess long-range order – only short- or medium-range order, like liquids. Unlike liquids, however, glasses have mechanical properties akin to those of solids. A number of different approaches have been explored to study the physics of glasses, including harnessing the technology of colloidal physics, as many in our research group and others do. Metallic glasses are also model glassy systems, formed when a molten liquid precursor is supercooled at a rate fast enough that glass formation wins over crystal formation.

Here’s a quick description of two recent papers that look at two different aspects of metallic glasses…

1. How easy is it to form a glass? (Li et al., Science 2008)
Intuitively, one might expect that if the liquid precursor to a metallic glass has higher packing density, the atomic subunits making up the liquid have less “free” volume to explore and hence have a lower probability of forming an ordered cluster to nucleate crystal formation: the glass is “easier” to form. (Note that because the density at the glass transition is continuous, unlike the transition between a liquid and crystal, the density of the liquid precursor and the density of the newly-formed glass are the same thing).

Surprisingly, it seems there is very little clear experimental demonstration of this correlation between the density of a metallic glass and the ease with which it is formed. In this paper, Li et al. show a very nice route towards this. They use microfabrication to produce an array of silicon nitride cantilevers, sputter-coated with differing compositions of the binary alloy CuxZr1-x, a popular system for studying metallic glasses. They measure the density difference between the as-deposited glass and the crystal that results from thermally annealing their samples by measuring the deflection of the cantilevers before and after annealing (the density of the resulting crystalline state can be estimated using equilibrium thermodynamics). Separately, they measure the ease of glass formation for glasses of differing composition by casting them in wedge-shaped molds; a cross-section of the resulting solid shows a clear boundary between the glassy and the crystalline state, and the thickness of the lower (denser) glassy state is a standard metric for how easily it formed. The beauty of these experiments is that they are quite straightforward, and look at this particular system over a range of relative compositions.

Sure enough, Li et al. see a very nice correlation between the glass/crystal density difference and the ease of glass formation over the range of compositions they study. Interestingly, three particular compositions seem to form the glassy state very easily – and surprisingly, only one can be predicted using existing models!

2. What is the medium-range structure of a glass? (Ma et al., Nature Materials 2008)
A good deal of work has focused on understanding the microscopic nature of short-range order (SRO) – that is, order on the lengthscale of just a few atoms – in metallic glasses (e.g. Miracle, 2004). A relatively recent model, which is accumulating more and more experimental support, is that alloyed metallic glasses are composed of small clusters of majority atoms surrounding a minority atom “seed”. If one is willing to believe this model, the next question is: how do we use this understanding to better understand the nature of medium-range order (MRO) in metallic glasses? It has been suggested that these clusters may closely pack to form the metallic glass. In this paper, Ma et al. suggest another idea: these clusters form a fractal network of dimension 2.31.

The evidence Ma et al. compile to support this notion is compelling. For starters, fractal networks are ubiquitous in materials of interesting microstructure (e.g. see the references in Ma et al.’s paper). One close example is quasicrystals, which also lack translational symmetry, and have also been shown to be describable as fractal networks. Second, Raman and neutron-scattering experiments performed in the 1990s suggested the existence of frequency-dependent vibrational excitations in metallic glasses, with a crossover between the phonons that are characteristic of ordered crystals and “fractons”, vibrational excitations of a fractal network. (This is the first time I have come across the idea of a “fracton”, and I will have to spend some time rigorizing how I think about them.) In this paper, Ma et al. present their own and others’ neutron and X-ray diffraction data of a number of metallic glasses of different compositions (including the CuxZr1-x mentioned in the previous paper).

In crystalline materials, the momentum-space position of the first Bragg peak in a powder diffraction pattern (q_{1} ) is inversely proportional to the largest distance between two atomic planes of the sample – small q ’s probe large lengthscales. Representing the atoms making up the crystal as hard spheres of volume v_{a} , this distance scales as v_{a}^{1/3} (that is, q_{1}\cdot v_{a}^{1/3}\sim constant). The key idea in Ma et al.’s paper is that while metallic glasses do not have well-defined Bragg peaks because of their disordered structure, the medium-range order does give rise to a few diffuse scattering “haloes”. Thinking about the atoms of the metallic glass as hard spheres as well, one expects that q_{1}\cdot v_{a}^{1/D}\sim constant, where v_{a}= molecular weight/(mass density * Avogadro’s number) and D is the fractal dimension of the network making up the metallic glass. Strikingly, they see this kind of scaling behavior, with D=2.31 . Further analysis of the atomic pair distribution function of their samples (essentially, a measure of how correlated atoms at different distances from each other are) supports this notion of a fractal network over medium-range length scales. It’ll be interesting to see how future work builds on this idea. I’m a bit confused as to what the “atomic volume” as calculated in Ma et al.’s paper physically represents in these alloyed metallic glasses, something the authors don’t go into too much detail on. Naively I would guess this is somehow related to the size of the clusters making up the fractal network – perhaps it would be interesting to use this kind of data to pull out this information and see if it agrees with other work on the structure of these SRO clusters.
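As a rough illustration of how a fractal dimension could be pulled out of the scaling q_{1}\cdot v_{a}^{1/D}\sim constant, here is a minimal sketch that fits a straight line to log q_{1} versus log v_{a}. The numbers are synthetic (generated from the scaling law itself with D = 2.31), not Ma et al.’s data, and the function name is my own.

```python
import math


def fit_fractal_dimension(atomic_volumes, q1_values):
    """Least-squares slope of log(q1) vs log(v_a).

    If q1 * v_a**(1/D) is constant, the slope is -1/D, so D = -1/slope.
    """
    xs = [math.log(v) for v in atomic_volumes]
    ys = [math.log(q) for q in q1_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return -1.0 / slope


# Synthetic inputs: hypothetical atomic volumes and peak positions
# constructed to obey the scaling law exactly (prefactor 9.0 is arbitrary).
D_TRUE = 2.31
volumes = [12.0, 14.0, 16.0, 18.0, 20.0, 22.0]
q1s = [9.0 * v ** (-1.0 / D_TRUE) for v in volumes]

D_fit = fit_fractal_dimension(volumes, q1s)
print(D_fit)
```

With real diffraction data the points would scatter around the line, and the interesting question is whether the fitted exponent is distinguishable from the 1/3 expected for a space-filling arrangement.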

→ 1 Comment · Categories: Condensed Matter Physics · Papers · Physics · Science

Particles in Fluids

October 29, 2008 · 1 Comment

I’m back – posts will be much shorter and more paper-centered from now on, as classes and research continue to consume my life.

Three really cool papers recently, all dealing with particles in some kind of flow:

1. Effects of particles in chaotic flow (Ouellette et al., PRL 2008)
Small tracer particles are often used to ‘visualize’ fluid flows, by seeding them into the fluid. If the particles are small enough and have low enough density to match the fluid, they can be considered as infinitesimal fluid elements to a good approximation. This breaks down if the particles are (i) too large, or (ii) too dense. While the effect of having small but dense particles is pretty well studied (since the particles can be taken to be pointlike), the case of large particles is more complicated – one has to solve the relevant Navier-Stokes equation over the surface of each particle. How does a large tracer particle perturb fluid flow?

By imaging the motion of tracer particles of different sizes in a chaotic fluid flow, Ouellette et al. study the flow field around a large tracer particle as well as its own motion. (The smallest particles act as the ‘ideal’ infinitesimal fluid elements that follow the flow well.) The effect of tracer particles being too large or too dense is often thought to be captured by the Stokes number St\sim(\rho_{p}/\rho_{f})(a/L)^{2}\cdot Re where \rho_{p} and \rho_{f} are the particle and fluid densities, a and L are the particle radius and characteristic flow length scale, and Re is the fluid Reynolds number. It is surprising, then, that the data in these experiments does not seem to solely depend on the Stokes or Reynolds numbers – these dimensionless parameters don’t appear to capture all of the physics associated with inertial effects. Weird.
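For concreteness, the scaling form of the Stokes number quoted above can be evaluated as follows. This is only a sketch of the scaling relation (any order-one prefactor is dropped), and the parameter values are invented for illustration rather than taken from Ouellette et al.

```python
def stokes_number(rho_p, rho_f, a, L, Re):
    """Scaling estimate St ~ (rho_p / rho_f) * (a / L)^2 * Re, prefactor omitted."""
    return (rho_p / rho_f) * (a / L) ** 2 * Re


# Hypothetical values: density-matched particles in a flow with Re = 100.
small = stokes_number(rho_p=1.0, rho_f=1.0, a=0.01, L=1.0, Re=100.0)
large = stokes_number(rho_p=1.0, rho_f=1.0, a=0.02, L=1.0, Re=100.0)
print(small, large, large / small)
```

Doubling the particle radius at fixed density ratio and Reynolds number quadruples St, which is why particle size is expected to matter so strongly in this picture.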

2. Effects of particles in turbulent flow (Tanaka and Eaton, PRL 2008)
Ok, so the previous paper dealt with non-turbulent flow. This one deals with the case of how particles in a turbulent flow affect the turbulence. Do they make it more turbulent, make it less turbulent, or (unlikely) not affect the flow at all? Can these effects be captured by the Stokes or Reynolds number, unlike the previous case?

Tanaka and Eaton looked at data from many different experiments on this subject, finding (as in Ouellette et al.’s experiments) no systematic dependence on the Stokes or Reynolds numbers. Hm. Instead, they use some very beautiful dimensional analysis to come up with a new dimensionless parameter, what they call the particle momentum number Pa, which seems to capture more of the physics here – for very large and very small values of Pa, the particles augment turbulence, while for an intermediate range of Pa turbulence is attenuated. (Instead of attempting to write the two forms of Pa out, I’m just going to refer the reader to equations 14 and 15 in the paper). This is cool – finally a parameter that yields information about the physics of the situation!

Physically, is there a simple way of seeing what Pa actually means, versus just being a combination of Re, St, and various relevant variables? (The Reynolds number, for example, can be understood as telling one about the relative importance of inertial forces versus viscous forces on a tracer particle; the Stokes number on the other hand tells one about how ‘impactable’ a tracer particle is – it describes how independently the particle can move from the carrier flow.) I wasn’t fully able to decipher this.

Secondly, and I’m not sure if this even makes sense or not, but could this have any relevance to Ouellette et al.’s experiments, in which St or Re on their own were not enough to account for the effects of perturbations due to tracer particles? I did some mindless playing around with Ouellette et al.’s data from figure 4c-d, plotting it as a function of two such possible parameters. The first, ‘Pa1’, is inspired by Tanaka and Eaton’s particle momentum number, and is defined as Pa1=Re^{-1/4}\cdot St ; the second, ‘Pa2’, is Tanaka and Eaton’s equation 15: Pa2=\frac{1}{54\sqrt{2}}\cdot\frac{Re^{2}}{\sqrt{St}}\cdot(\rho_{p}/\rho_{f})^{3/2}\cdot(2a/L)^{3} . This is what I’m showing here:

Perhaps unsurprisingly, the data for the two tracer particle sizes still don’t collapse onto a single curve. Oh well. Again, I’m not entirely sure if it makes sense to ask this question, but is there some combination of St and Re (similar to Pa) that is a more relevant dimensionless parameter for Ouellette et al.’s experiments?
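For anyone who wants to repeat this kind of playing around, the two trial parameters as defined above can be computed with a few lines. The function names are mine and the inputs below are arbitrary test values, not numbers from either paper.

```python
import math


def pa1(Re, St):
    """Pa1 = Re^(-1/4) * St, as defined in this post."""
    return Re ** -0.25 * St


def pa2(Re, St, rho_p, rho_f, a, L):
    """Pa2 = (1 / (54*sqrt(2))) * (Re^2 / sqrt(St)) * (rho_p/rho_f)^(3/2) * (2a/L)^3."""
    prefactor = 1.0 / (54.0 * math.sqrt(2.0))
    return (prefactor
            * (Re ** 2 / math.sqrt(St))
            * (rho_p / rho_f) ** 1.5
            * (2.0 * a / L) ** 3)


# Arbitrary inputs chosen so the arithmetic is easy to check by hand.
value1 = pa1(Re=16.0, St=4.0)  # 16^(-1/4) * 4 = 0.5 * 4
value2 = pa2(Re=16.0, St=4.0, rho_p=1.0, rho_f=1.0, a=0.5, L=1.0)
print(value1, value2)
```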

3. Phonons in a 1D microfluidic ‘crystal’ (Beatus et al., Nature Physics 2008)
This is a cute paper that touches on many, many interesting ideas. The basic setup is straightforward: Beatus et al. produced a continuously-flowing array of uniformly-spaced water drops in a microfluidic channel, surrounded by a continuous oil phase. The drops are disc-like in shape (they are confined in the z-direction), unconstrained in the x-direction (the direction of flow), and the constraint in y (i.e. the width of the channel) is varied, thus varying the friction on the drops.

The cool thing is that these researchers see interesting longitudinal and transverse fluctuations (it’s worth looking at the supplementary movies), and by Fourier-transforming their data, they pull out dispersion relations that surprisingly show acoustic phonon propagation. The phonon propagation speed is much smaller than the speed of sound in the surrounding fluid, which leads them to hypothesize that these collective modes arise from dipole-like hydrodynamic interactions between droplets. Very pretty stuff.
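To show the general idea of reading a dispersion relation off a space-time Fourier transform, here is a minimal sketch on synthetic data. It is not Beatus et al.’s analysis: the displacement field is a made-up superposition of three travelling waves with frequency proportional to wavenumber, and the brute-force transform is only practical for tiny arrays like this one.

```python
import cmath
import math


def dispersion_from_displacements(u, threshold=1e-6):
    """u[t][n] is the displacement of drop n at frame t.

    Returns {k_index: w_index}: for each wavenumber index with appreciable
    power, the frequency index where the 2D Fourier amplitude is largest.
    """
    T, N = len(u), len(u[0])
    result = {}
    for k in range(1, N // 2):
        best_w, best_amp = None, threshold * T * N
        for w in range(T):
            s = 0j
            for t in range(T):
                for n in range(N):
                    phase = 2.0 * math.pi * (k * n / N - w * t / T)
                    s += u[t][n] * cmath.exp(-1j * phase)
            if abs(s) > best_amp:
                best_w, best_amp = w, abs(s)
        if best_w is not None:
            result[k] = best_w
    return result


# Synthetic "crystal": 16 drops, 16 frames, modes k0 = 1, 2, 3 travelling
# with frequency index c * k0 (a linear, acoustic-like dispersion).
N = T = 16
c = 2
u = [[sum(math.cos(2.0 * math.pi * (k0 * n / N - c * k0 * t / T))
          for k0 in (1, 2, 3))
      for n in range(N)]
     for t in range(T)]

relation = dispersion_from_displacements(u)
print(relation)
```

The recovered frequency index grows linearly with the wavenumber index, which is the signature of acoustic modes; the slope plays the role of the propagation speed.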

→ 1 Comment · Categories: Condensed Matter Physics · Fluid Dynamics · Mathematics · Papers · Physics · Science