
Saturday, January 17, 2026

Origins of Morality

Note to Reader

This post discusses views on morality and philosophy that some may find personally challenging. It's best read as an open discussion and not as a challenge to the reader. Additionally, I'm not formally trained in any of these topics; the post was drafted approximately 2 years ago and represents my thinking at that time.

Sometimes thinking about life and society leads me to deep philosophical questions. One such topic is the origins of morality. A while back (~2 years or so) I had an intriguing discussion with a family member about it. The seed for the discussion was the ongoing conflicts in the world and how different groups of people view what is "right" and what is "wrong".

So what was the take? Well, my family member posited that morality must be absolute for humanity to continue to exist. I take the contrary view that morality is subjective (i.e., relative) in all circumstances.

Before moving forward, I need to define what is meant by moral absolutism and moral relativism. You can of course get more details on the exact topic by going through the reference links, but let me first start with moral absolutism.

Moral Absolutism

The main guiding feature of moral absolutism is that certain actions are intrinsically right or wrong, regardless of context or consequences. This means there should be, at minimum, one principle that ought never be violated. An example is unjustified killing of a person. Here the "unjustified" refers to an absolute understanding of what constitutes an unjust cause.

My objection to this is that there must be some seed of comprehension that leads to this "absolute understanding", but where does one derive from first principles the "absolute" nature of the morality? It is unclear that such a process is possible without some faith, credence, or belief in a "higher power" or a governing entity. This, however, leads to a dilemma because it is not provable (in terms of logic) that such a seed of truth exists. In my view, all we can assume is that groups of people over time develop ideas and concepts that tend to be beneficial in societies, and these then act as the seeds of truth that validate or confirm their self-derived view as "absolute". Religion is one such example of a seed of truth that is used to justify the "absolute" nature of morality. Many groups of humans have developed religions, and these converge on some top-level principles; in my view, this happens because those principles provide the greatest net benefit to everyone.

I should add some additional discussion on a related view of "moral universalism", which argues that some moral principles apply universally to all individuals regardless of group or society. Unlike absolutism, universalism allows for some flexibility in application and does not necessarily claim that moral principles are completely context-independent. However, it maintains that there are objective moral truths that hold across all groups of people.

I believe my family member was taking the vantage point of moral absolutism, not universality. Both, in my opinion, are difficult to justify, but universality would require that one takes the sentiment that moral principles apply universally, though with some contextual flexibility. Absolutism, in contrast, simply requires some guiding or seeding principle that certain actions are intrinsically right or wrong. Thus one's axiom could be:

"I believe a higher power exists, and this divine entity mandates that unjust killing is wrong."

There are two problems here that I find:

  1. What if you don't believe in this "higher power" and, moreover, you can't prove with logic or observation the existence or divinity of such an entity?
  2. Assume you do believe; clearly defining "unjust" in the nuance of every possible circumstance is exceedingly difficult. We do not really know how unjust something is because it's always relative to the context of the situation and previous experiences.

Moral Relativism

In contrast, moral relativism posits that all of morality is subjective and is a matter of lived experiences and environment. If that's the case, how are morals seeded in an individual? How does society birth morality? I argue that this is simply a result of the local social dynamics between humans and their environments.

We clearly see this with the diversity of cultures and religions across the world. What some might view as morally bankrupt, others may view as the commandments of their deity or elders. There is no right or wrong in an absolute sense, but rather in a local, relative sense.

But then you may ask: "Then why is it that most human groups have converged towards the idea that murder¹ is fundamentally wrong?"

To which I would argue that there is a natural ranking of emergent morals within human social groups, and the prohibition of murder seems to be one that continually recurs. This is most likely due to the mutual benefit among members of those groups, who maximize their own objective function (i.e., survival) by agreeing on a set of conditions (i.e., morals/ethics), one of which is: don't murder each other. This is usually codified through religion. An analogy I think of from physics is the behavior of interacting spin systems, such as a ferromagnetic Ising model: entropy would drive these systems toward a state of disordered spins (think of murdering each other, which gives you an advantage because of less competition for resources), yet due to the high energy penalty of such a configuration (i.e., you might get murdered too!), the spins self-arrange into ordered states (i.e., don't murder each other, because although I (we) don't benefit as much in the short term, I (we) do in the long term because we live longer and have progeny).
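To make the physics analogy a bit more concrete, here is a minimal sketch of the energy-versus-entropy competition. It uses a tiny ferromagnetic Ising ring as a stand-in, which is my own simplifying assumption and much simpler than a real spin glass, and it enumerates every state exactly. The function names, the coupling `J`, the ring size, and the two temperatures are all illustrative choices.

```python
import itertools
import math


def ising_ring_energy(spins, J=1.0):
    """Energy of a ring of +/-1 spins; aligned neighbors lower the energy."""
    n = len(spins)
    return -J * sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def mean_abs_magnetization(n_spins, temperature, J=1.0):
    """Boltzmann-weighted average of |m| over all 2**n_spins states (k_B = 1)."""
    weighted_sum = 0.0
    partition = 0.0
    for spins in itertools.product((-1, 1), repeat=n_spins):
        weight = math.exp(-ising_ring_energy(spins, J) / temperature)
        partition += weight
        weighted_sum += weight * abs(sum(spins)) / n_spins
    return weighted_sum / partition


# Hypothetical temperatures: one where the energy penalty dominates,
# one where entropy dominates.
cold = mean_abs_magnetization(n_spins=8, temperature=0.5)
hot = mean_abs_magnetization(n_spins=8, temperature=10.0)
print(f"cold |m| = {cold:.3f}, hot |m| = {hot:.3f}")
```

At the low temperature the energy penalty wins and the spins mostly align ("cooperate"); at the high temperature entropy wins and the average alignment drops.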

My Stance

So it's probably apparent from the tone above that I'm a staunch moral relativist. The reason is that I don't see how you get to absolutism without a "higher power"² or governing entity. However, if one goes with absolutism, the debate between an absolutist and a relativist is dead on arrival because of the axiomatic difference, i.e., one says god exists, the other does not, and no proof can be given to the other side. As for arguing absolutism through governing physical laws, it's possible one can argue these laws of morality exist, but again it's hard to see how you get there; your best bet is to try to argue it through physics and complexity theory, but my gosh, that is a challenging burden of proof for one to steelman.

Thus I stand by my view that morality is subjective. In the grand scheme of things, if we are lucky, there might be by happenstance a convergence of morals that align well with many groups of people. With globalism this could be more likely; however, with the rise of nationalism it might appear that one group is immoral beyond belief, but in reality it's just a perspective that we're not attuned to. So while many will disagree about what is right and wrong with events going on around the world, it's because of moral relativism and nothing more. It is simply a matter of relative perspective.

Footnotes


  1. Note that murder refers specifically to intentional killing with malice. So killing and murder are not synonyms.

  2. Here I use "higher power" broadly to refer to any governing entity, principle, or force that is above human comprehension or ability; not necessarily a deity or god. This could include divine entities, but also abstract principles, natural laws, or any authority that transcends individual human understanding or capability. 


Reuse and Attribution

Sunday, January 4, 2026

LSCM Dataset

Early last year I started playing around with the CrystaLLM package, which I've also mentioned in previous posts, to gauge the utility of these generative tools for structure creation. CrystaLLM is an autoregressive model that generates crystal structures by conditioning on a prompt and populating a CIF-format document [1]. So what it is doing is writing the CIF file given the chemistry and an optional spacegroup and/or unit replicate factor. I'm not going to go into the technical details of the model architecture and training data here, as that's a whole post I need to do on generative AI for structures. The main thing is I used my newish personal desktop with an RTX 5070 Ti to do the inference and ended up with about 7,889 structures that are distinct¹. It did take quite some time to configure CrystaLLM and generate the structures², since I enabled/modified the code to verify that the CIF files were valid and matched the target spacegroup symmetry.

In addition to using CrystaLLM to generate the structures, I decided beforehand that I would wrap in a labeling step that would compute the total energy, forces, and stress of the crystal structures. I figured, why not use an ensemble of pre-trained foundation models that are on Matbench Discovery to do this? For no particular reason other than being the ones I was familiar with, I selected seven foundation models to label each structure. This produced the final dataset of 7,889 structures, each labeled by the seven foundation models, yielding an ASE database of 53,749 entries.

Figure 1. Element Distribution

Figure 2. Spacegroup Distribution

The resulting distribution of elements, and the fraction associated with binary, ternary, quaternary, and quinary compounds, can be seen in Figure 1. From my perspective the element and component distributions seem reasonable. I also looked at the spacegroup distribution, shown in Figure 2. I'm less familiar with what to expect there, but again it seems reasonable that the majority are orthorhombic or tetragonal.

For now, due to limited personal time, I decided I would make the dataset available on Zenodo upon request, since I don't think I'll have much time on weekends to work on the analysis I was hoping to do with the dataset. Eventually, by the end of the year, I will create a blog post³ on the dataset and analysis. The main question I was trying to answer is: can the variance across the ensemble of pre-trained foundation models serve as a proxy for epistemic uncertainty, to identify which unknown/novel CrystaLLM-generated crystal structures are physically legitimate versus incorrect? This would also require considering that the foundation models are trained on mostly the same datasets, and therefore systematic biases or shared epistemic limitations might exist within all the models. This means that ensemble agreement could be a false positive: the models may agree because they share the same blind spots, which limits the extent to which ensemble variance purely reflects epistemic uncertainty about novel structures.
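As a rough illustration of the ensemble-variance idea, the sketch below computes the spread of energy-per-atom predictions across models and flags structures where the models disagree. Everything here is hypothetical: the structure names, the energies, the `threshold`, and the helper names are made up for illustration and are not taken from the dataset or from any analysis I have run.

```python
import statistics

# Hypothetical energies per atom (eV/atom) from seven models for two
# made-up structures; these numbers are illustrative, not from the dataset.
predictions = {
    "structure_a": [-3.41, -3.39, -3.40, -3.42, -3.38, -3.40, -3.41],
    "structure_b": [-2.10, -1.35, -2.80, -1.90, -3.05, -1.60, -2.45],
}


def ensemble_spread(energies_per_atom):
    """Return (mean, sample standard deviation) across model predictions."""
    return statistics.mean(energies_per_atom), statistics.stdev(energies_per_atom)


def flag_uncertain(predictions, threshold):
    """Names of structures whose ensemble spread exceeds the threshold."""
    return sorted(
        name
        for name, energies in predictions.items()
        if ensemble_spread(energies)[1] > threshold
    )


# The threshold is an arbitrary placeholder, not a recommended value.
flagged = flag_uncertain(predictions, threshold=0.1)
print(flagged)
```

Note that a small spread only says the models agree; given the shared training data, it does not by itself say the structure is physically sound.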

The Zenodo entry, which I call the Labeled Synthetic Crystal Material (LSCM) dataset [2], can be found here. If you would like to obtain the dataset, I kindly ask that you request it via the Zenodo entry and I will be happy to provide access. The dataset is in ASE SQLite3 format. I'm not providing any guarantees on the quality, as this is a raw dataset generated purely by the workflow using CrystaLLM and ASE calculators for the foundation-model checkpoints. As to whether I'll add to the dataset in the future, probably not, as it ties up my GPU considerably and I need it for other stuff.

Footnotes


  1. The generated structures are distinct within the dataset, i.e., no repeated chemistry + spacegroup combinations, but I haven't yet checked them against the training datasets and known structures in databases like ICSD or COD.

  2. I think I started running this on my personal machine in March 2025 and stopped in July 2025, but this was not a continuous process; I really only ran things on the weekends.

  3. If the analysis results turn out to be particularly important and impactful (for example, several generated structures are legit and unknown), then I would probably lean more toward writing a formal research paper. This would of course require a lot more time and effort, since I would probably have to do some DFT calculations and scour the literature. It could be that LLM tools make this feasible enough that it actually becomes a viable option for me to do on my own time.

References

[1] L.M. Antunes, K.T. Butler, R. Grau-Crespo, Crystal structure generation with autoregressive large language modeling, Nature Communications. 15 (2024). https://doi.org/10.1038/s41467-024-54639-7.

[2] S. Bringuier, Labeled Synthetic Crystal Material Dataset, (2026). https://doi.org/10.5281/zenodo.18135201.



Sunday, December 28, 2025

2025 Reflections: Year of catch-up

The pace at which science and engineering are moving is pretty hard to keep up with. I've always felt things move really fast, or appear that way, because of the real-time updates we get through various outlets like arXiv, LinkedIn, and other platforms. The problem is that as a researcher/scientist/engineer/technologist it's pretty hard to keep up in a competent and competitive way. From my perspective, I'm always trying to ensure my understanding and technical ability stay as close as possible to what's coming into the spotlight. This year was no different, but it was really hard to stay on top of all the emerging developments in computational science, self-driving labs, AI for science, materials science, etc.

In terms of the category coverage of my posts this is what the 2025 treemap looks like:

2025 category coverage

The top 5 labels were: Machine Learning, Research, Atomistics, Tools, and AI. The Research label relates to research papers I've read or looked at, not research I'm doing myself². I'm not surprised about ML and AI, as I've become more and more interested in them over time. The tooling is interesting since most people probably don't think to write about this, but I'm finding this blog a good place to document things related to Linux and Python tooling.

I missed a lot of good stuff this year but I'm going to try to summarize what I thought were some of the most interesting things I came across and wrote about, as well as what I am hoping to catch up on.

Self-driving labs

My hope for this year was to cover the rise and acceleration of self-driving lab research, development, and deployment. There was a lot of good stuff from topics brought up by people like Sterling Baird, who will now lead a new vertical cloud lab at Brigham Young University. In general the field has a lot of good momentum; the challenge will of course be orchestration and integration of different synthesis, characterization, and testing equipment to form self-driving loops that are either fully autonomous or have limited human intervention. I think I only wrote one post, at the beginning of the year, where I tried to argue that 2025 was the year of robotics in labs. I think this wasn't a bad prediction, as the number of publications on the topic increased by ~54% over 2024¹. So the growth is strong and led by some really capable academics and start-ups, in my opinion.
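For anyone wanting to reproduce this kind of publication-count comparison, here is a minimal sketch of how yearly counts could be pulled from the OpenAlex works endpoint and turned into a growth percentage. The search term passed in is up to you (the exact query I used is not reproduced here), the helper names are hypothetical, and the counts in the last line are placeholders that only demonstrate the arithmetic.

```python
import json
import urllib.parse
import urllib.request


def openalex_count_url(search_term, year):
    """Build an OpenAlex works query URL for one publication year."""
    params = urllib.parse.urlencode(
        {"search": search_term, "filter": f"publication_year:{year}", "per-page": 1},
        safe=":",
    )
    return f"https://api.openalex.org/works?{params}"


def fetch_count(search_term, year):
    """Return the number of matching works (requires network access)."""
    with urllib.request.urlopen(openalex_count_url(search_term, year)) as response:
        return json.load(response)["meta"]["count"]


def percent_growth(previous, current):
    """Year-over-year growth in percent."""
    return 100.0 * (current - previous) / previous


# Placeholder counts purely to show the arithmetic; not real query results.
print(f"{percent_growth(previous=100, current=154):.0f}%")
```

Calling `fetch_count` for two consecutive years and passing the results to `percent_growth` gives the comparison; the counts will drift over time as OpenAlex indexes more works.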

What I'm hoping to see is more open standards from hardware vendors to allow more rapid integration. Robotic manipulation systems are probably in a good place and not the major issue; rather, characterization and testing equipment don't have easy IO and primitives that make integrating with decision-making loops easy. I could be wrong, since I'm not an active researcher or developer in this space but rather just a tinkering enthusiast, but from what I've done, unless you're building your own equipment (think of an open-source optical microscope), it's going to be hard to fully integrate and automate something like SEM characterization in a decision-making loop.

Materials discovery

This, in my opinion, is the topic du jour in computational materials science, and it is another area where the number of publications is exploding, not so much on actual new materials but rather on the computational tools and techniques to accelerate the discovery and design of new materials. I'm not going to opine much on whether this is legit or fraught with false promises; there are a lot of good people already doing that [1]. The main thing is that this is here to stay for some time, as big companies, start-ups, and academics are all racing to make their impact in the computational and lab aspects of materials discovery.

I don't think this will slow down at all in 2026; it will probably see more growth, although I don't think it will come so much from a slew of new GNN or transformer model architectures as from more data and improved integration in workflows. My guess is that fine-tuning and distillation of pre-trained models will be where a lot of the research action is.

Computing & simulations

I wrote a fair amount about tooling this past year. I'm not sure why, but I think it was mainly a way to document for myself things I always come back to and am tired of figuring out over and over again. Truth is, I tend to refresh my Linux machine fairly often just to keep me from getting stale or too comfortable with forgetting how to do things. I wrote a post on the awesome uv package manager for Python. I strongly suggest people check out uv for their Python projects. I also wrote about backing up headless Linux machines using rclone and restic/borg/rsnapshot here. To be honest, I've used deduplicated backups several times to recover things I messed up.

On the simulation side I played around with g-xTB [2], then wrote an ASE wrapper for it and a simple React UI, which I wrote a post on. This was a nice little weekend project that I enjoyed doing, though I wish I could have spent more time making it robust and generally useful for scientific work.

I also very much enjoyed the paper on boron carbide by Ghaffari et al. [3], which I summarized in the post I now know why boron carbide shatters. It showed the good progress data-driven methods are making in capturing the mechanisms behind material response and behavior.

AI for science

I spent quite a bit of time writing posts on AI for science, as it's just so active and interesting to follow. Most of my posts are aimed at improving my understanding and knowledge of the techniques and methods being developed and used. I started off the year by looking at graph-pes, since I thought it was one of the cleanest implementations for quickly building and training MLIP models. I haven't played around with it in a while, but I'm hoping to find some fun use cases for it in 2026. Related to this was a post mentioning that MACE uses the total energy rather than cohesive/binding energies, so this subtle point should be kept in mind when using it.
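The total-versus-cohesive distinction comes down to subtracting isolated-atom reference energies from the total energy. Here is a minimal sketch of that bookkeeping, assuming one common sign convention (negative means bound). The function name, the element labels, and all the numbers are hypothetical and are not taken from MACE or any dataset.

```python
from collections import Counter


def cohesive_energy_per_atom(total_energy, symbols, atom_reference_energies):
    """(E_total - sum of isolated-atom energies) / N, in the input energy units."""
    counts = Counter(symbols)
    reference = sum(atom_reference_energies[el] * n for el, n in counts.items())
    return (total_energy - reference) / len(symbols)


# Hypothetical values in eV, chosen only to illustrate the bookkeeping.
references = {"A": -1.0, "B": -2.0}
e_coh = cohesive_energy_per_atom(-10.0, ["A", "A", "B", "B"], references)
print(e_coh)
```

The practical point is that two models can only be compared on energies once both are expressed against the same reference.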

Then I wrote about the ORB v3 MLIP model and my perspective on their equigrad loss regularization technique. I thought this was a really clever/efficient way to learn the rotational equivariance without having to use any SO(3) equivariant layers, which are computationally intensive and architecturally complex. In truth it may not be as necessary if libraries like cuEquivariance and FlashTP make the tensor products faster and more efficient.

Related to MLIPs I wrote a post on synthetic datasets & model distillation and how one can efficiently create synthetic datasets from fine-tuned foundation models to train cheaper student/surrogate models. I actually really think this is the direction things will head once foundation models all start to saturate the benchmarks. We will see most researchers taking them, fine-tuning, and then building more efficient models. I even believe some of these models will be fitted to physics-based potentials rather than just pure data-driven models.

I also wrote about the Skala XC functional and how it was able to achieve near chemical accuracy at a computational cost comparable to meta-GGA functionals. This is probably going to continue getting more attention and improving. As mentioned in the Computing & simulations section, the post on g-xTB discussed how semi-empirical methods for molecules are showing DFT-level accuracy at tight-binding computational cost.

More on the ML/AI side, I wrote a few posts on topics where I wanted to improve my foundational understanding. One of them was akin to me taking notes on Petar Veličković's review talk on GNNs. I enjoyed this because it made me feel more confident in my understanding, but as always I need to ensure I truly understand the maths and the implementation. There was also the post on conformal prediction, which I had not been familiar with before; after reading and posting on the topic, I see the value in the approach. Finally, I wrote a post on trying ChatGPT's image generation tool for science illustrations and how it was a bit of a letdown: not awful, just not good enough for making scientific illustrations that one would want to use in a paper.

At the beginning of the year I also wrote a blog post on Bayesian parameter estimation and how it can be used to estimate the parameters of a line or other simple models. I just like these basic posts; even though for many they are elementary topics, for me they ensure I understand things from basic principles. I say this because it is very easy to get confused in the advanced aspects of these topics if you can't tie them back to the basics. Additionally, I'm sure I'll come back to posts like this in the future and say "wait, maybe I don't understand" or "what was I saying? This is wrong". The beauty of a personal blog is that it is not a formal publication or communication (as long as you indicate such), and so mistakes are less embarrassing³.

Books

This was the area that I'm a little down about since I didn't get to reach my reading goal for this year. I read a few books as listed in my reading list, but far from the 10 books I wanted to read. The issue is that many of them are dense technical monographs that take a lot of concentration to sit down and read, whereas the ones I finished were more the kind you pick up and read in a few hours during those random lull moments in my life. For sure this is something I will work on in 2026.

Right now, my non-technical reading list for 2026 is:

I'll be lucky if I get through 2 of these books in 2026, but I'll aim for all as I'm really eager to read these in addition to the technical books I have stacked up 😞.

Statistical mechanics & quantum mechanics

This was a bit of an off-topic area I wrote about, but it's something that I feel I've become less strong in over the years. From 2013-2016 I was in tip-top shape with my knowledge of stat-mech. Towards the end of the year I wrote a post on the two roads to thermal equilibrium and how they relate. Then I wrote two posts on quantum-related thoughts and topics. I felt like I needed to write on this since in 2019-2021 quantum computing was my main research focus. But in recent years it's been off and on, and I just feel like I've lost my sharpness in these areas. So I wanted to write about them, either as a review of what I had done or just to refresh my thinking on one of the topics. I'm sure mistakes were made in the posts, but at least I tried to put something together.

My hope is to continue on writing about these topics in 2026 to maintain my knowledge and understanding; you never know when this might come back into my main focus and thus good to ensure I'm sharp and knowledgeable about these foundational topics.

Outlook for 2026

So the question is: what will I try to write about in 2026? I don't know, but I'm hoping some good stuff. For one, I will probably try to write more posts like the ones that discuss papers I've read. These posts help me dissect dense papers and keep a nice log so I can refer back to them later and read something in my own words. I'd like to write some more posts on weekend coding projects, but I just don't get that much time to do this anymore. For example, I have a synthetic data workflow that uses CrystaLLM, and I'm hoping to write a post on it and post the dataset on Zenodo for others to explore and gauge its usefulness. I'm also hoping to write a little more on thermodynamic computing and, of course, self-driving labs.

Also I'm going to try to write/code a bit more on causal and Bayesian related topics since I want to become more proficient in actually building tools around these topics. Towards the other end of things I'm going to try to update my online presence, profiles, and such to keep my engagement with others in the community. So with that happy new year and see you in 2026! 🥳

Footnotes


  1. I used the OpenAlex API to query the number of publications on the topic of self-driving labs. These were the query results for 2024 and 2025 as of 12/24/2025.

  2. I don't post my work research here as this is my personal blog. Sometimes I'll post weekend projects or research-related efforts but always within my own time and not using any resources or results from work. 

  3. I never understood why making mistakes in science or technical research is viewed so negatively. I understand feeling embarrassed, but it seems like mistakes can ruin your reputation. It hasn't happened to me yet, but I've always felt that if I published a paper with a major mistake that needed to be retracted, I would (1) be grateful if someone caught and pointed it out... since we (I) don't want erroneous research to mislead others and (2) feel embarrassed, but as long as my intentions were unbiased, objective, and honest, I would acknowledge the mistake and learn from it. 

References

[1] J. Leeman, Y. Liu, J. Stiles, S. Lee, P. Bhatt, L. Schoop, R. Palgrave, Challenges in high-throughput inorganic material prediction and autonomous synthesis, (2024). https://doi.org/10.26434/chemrxiv-2024-5p9j4.

[2] T. Froitzheim, M. Müller, A. Hansen, S. Grimme, g-xTB: A General-Purpose Extended Tight-Binding Electronic Structure Method For the Elements H to Lr (Z=1–103), (2025). https://doi.org/10.26434/chemrxiv-2025-bjxvt.

[3] K. Ghaffari, S. Bavdekar, D.E. Spearot, G. Subhash, Influence of Crystal Orientation and Polymorphism on the Shock Response of Boron Carbide, SSRN Preprint, (2025). https://ssrn.com/abstract=5186595.


