Inventor's Paradox

Posts

The LLM as Teammate: Rethinking Software Development

June 29, 2026

Artificial Intelligence engines now help people build software. I am not talking about embedding AI capabilities into existing applications (that is another topic altogether) but about using AI to write the software itself, which removes the programming language barrier for the person building it. In the past, building software in a particular programming language meant being proficient in it or knowing someone who was. With AI, I can be less proficient in a platform and still build something meaningful with it, as long as the LLM is proficient in that programming language. And with the LLM joining the "programming team," the limitation of working hours almost vanishes. As long as I have enough LLM token quota left, even starting some coding work at 11 PM is possible, with no need to wake a programmer for a task. In some respects that is a win, but in others we get l...

Copying Big Oracle Tables into Iceberg

April 11, 2024

While piloting the Trino query engine (formerly PrestoSQL), I tried several data warehouse destination options. The first option is Trino's Hive connector, with the data stored in Minio storage accessed through the S3 API. The Minio services ran on IBM hardware (ppc64le architecture), but that's a story for another blog post. The metadata were stored in a Hive metastore, which took some effort to set up because at some point the metastore needs to access the S3 storage (I don't understand why) and thus needs the proper Hadoop AWS jars. The second option is Trino's Iceberg connector, which stores the data in the same Minio storage and Hive metastore using the Iceberg table format. For reference's sake, I will note the versions of the software used in this experiment: Trino version 442, deployed on OpenShift OKD 4.13 using Pulumi, with the Trino Helm template as a starting point. It uses a pristine Trino image taken from Docker Hub (docker.io...

Rants On NFS Lack of File Handle Visibility To Sysadm

June 08, 2023

NFS is a not-so-recent solution for sharing a filesystem across Linux nodes. It has some capabilities that are currently indispensable for Linux clusters: locking files across nodes and allowing either exclusive or non-exclusive access to the same file.

Fault Tolerance / Recovery
I have read some papers on NFS, and it should be able to recover from a restarting host / server. Unfortunately, on several occasions we found this to be not quite true: after a host serving NFS was restarted, we got stale handle errors on the clients. The workaround is to restart the NFS client, and if that still doesn't fix the situation, restart the NFS server. In our case we sometimes needed to restart twice across the cluster (because the client hangs while running a program over NFS). Some might say programs shouldn't be run over NFS (only data files should), but we have deployed a SAP-documented cluster architecture that requires such use of NFS.

Locks
When a file is locked on NFS, a lock is created in the hos...
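To make the exclusive and non-exclusive access mentioned above concrete, here is a minimal Python sketch using POSIX advisory locks from the standard `fcntl` module. It is only an illustration, not taken from the post: the helper names `try_lock` and `unlock` are hypothetical, and the demo uses a local temporary file. On an NFS mount the same calls would normally be forwarded to the server's lock handling, but that behaviour is assumed here rather than shown.

```python
import fcntl
import os
import tempfile


def try_lock(path, exclusive=True):
    """Try to take a non-blocking POSIX advisory lock on path.

    Returns the open file object holding the lock, or None if another
    process already holds a conflicting lock.
    """
    handle = open(path, "a+")  # opened read+write so both lock types are allowed
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.lockf(handle, mode | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def unlock(handle):
    """Release the lock and close the file."""
    fcntl.lockf(handle, fcntl.LOCK_UN)
    handle.close()


if __name__ == "__main__":
    # Hypothetical demo file; on a cluster this would live on the NFS mount.
    fd, demo_path = tempfile.mkstemp()
    os.close(fd)
    holder = try_lock(demo_path, exclusive=True)
    print("lock acquired:", holder is not None)
    if holder is not None:
        unlock(holder)
    os.remove(demo_path)
```

These locks are advisory and owned by the process, so a conflict only appears when a second process (possibly on another node) asks for an incompatible lock on the same file.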

Decision Making Puzzle 1

June 04, 2023

Let's say you are currently managing a big multi-year project that has a significant impact on your enterprise application landscape. Suddenly another significant business initiative appears that also requires a thoughtful, distinct change in the core applications to support it, on a somewhat shorter timeline. Do you: a) request help from one of your system implementation vendors in the first project, knowing that some parts require expertise that this vendor's team has, and knowing it would delay your first project because the whole team moves from the first project to the second one, b) find another team to support the change, knowing that effort will be needed to prepare the budget and the procurement process for the new team, so your involvement in the first project will also be reduced, or c) manage the changes between yourself and some peers, knowing that the only person to understand the needed change in the core applic...

Copying Big Oracle tables Using Apache Spark

June 03, 2023

Background
Sometimes we need to copy table data from one database to another. Logically, the best way to do this is a database-specific export (expdp in Oracle lingo) followed by an import into the destination database (impdp in Oracle). But this method sometimes has deficiencies, such as being unable to process a single table in parallel and requiring DBA access to the database. This post shows how to copy tables using Apache Spark and Apache Zeppelin.

Preparations
To allow Apache Spark to open Oracle JDBC connections, we need to add a dependency on ojdbc6.jar. To do this, write this paragraph in Zeppelin:

    %dep
    z.reset()
    z.load("/path/to/ojdbc6.jar")

Basic Approach
The most basic approach to copying table data is to retrieve the data in one query and save the resulting records in the target database.

    val tableNameSrc = "TABLENAME"
    val tableNameTrg = "TABLENAME"
    import java.util.Properties
    Class.forName("ora...
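The basic approach described above (read everything in one query, then write the records to the target) can be sketched without Spark or Oracle. The Python example below is a stand-in, not the post's Zeppelin code: it swaps in the standard-library `sqlite3` module for both databases, and the `copy_table` helper and the demo table layout are assumptions made for illustration.

```python
import sqlite3


def copy_table(src, trg, table_src, table_trg):
    """Copy every row of table_src (in src) into table_trg (in trg).

    Reads the whole table with a single query, then inserts the rows into
    the target in one transaction. Returns the number of rows copied.
    Table names are trusted here; they are interpolated into the SQL.
    """
    cursor = src.execute(f"SELECT * FROM {table_src}")
    rows = cursor.fetchall()
    if not rows:
        return 0
    # One "?" placeholder per column of the source result set.
    placeholders = ", ".join("?" for _ in cursor.description)
    with trg:  # commits on success, rolls back on error
        trg.executemany(
            f"INSERT INTO {table_trg} VALUES ({placeholders})", rows
        )
    return len(rows)


if __name__ == "__main__":
    # In-memory databases stand in for the source and target servers.
    src = sqlite3.connect(":memory:")
    trg = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE TABLENAME (id INTEGER, name TEXT)")
    src.executemany("INSERT INTO TABLENAME VALUES (?, ?)", [(1, "a"), (2, "b")])
    trg.execute("CREATE TABLE TABLENAME (id INTEGER, name TEXT)")
    print(copy_table(src, trg, "TABLENAME", "TABLENAME"))
```

Because all rows pass through a single query and a single writer, this sketch also shows why the basic approach becomes a bottleneck for big tables: nothing in it runs in parallel.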

Automating tasks using Python for sending files through SFTP

June 03, 2023

Scheduled background tasks are a staple of the IT world. For example, some programs should run at a certain time each day, or some procedure should run a few times an hour. In this post we will discuss how to send a file to another server using SFTP.

Crontab
Most of the time, background tasks are triggered using a UNIX-like crontab. There are alternatives as well, but they are outside this post's scope. The crontab entry usually calls a shell script that in turn triggers another program, such as calling a web site using wget or curl, or invoking an application command (php yiic commandname). The existing crontab entry for sending the file to another server uses a shell script to prepare the files, create control files, and send them across the network using sshpass with input redirection.

The Challenge
The problem we're facing is that sometimes the remote server doesn't respond normally. The cause is still unknown, and after getting incomplete file...
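The excerpt stops before the post's own solution, so the following is only one possible way to guard a scheduled transfer against a remote server that stops responding. It assumes the transfer is started as an external command (for example the existing shell script) and wraps it in a time limit with a few retries. The `run_with_retry` helper, its parameters, and the placeholder command are all hypothetical.

```python
import subprocess
import sys
import time


def run_with_retry(command, timeout_s=60, attempts=3, wait_s=5):
    """Run command, killing it if it exceeds timeout_s, and retry on failure.

    Returns True as soon as one attempt exits with status 0, otherwise
    False after all attempts have been used.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = subprocess.run(command, timeout=timeout_s)
            if result.returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            # subprocess.run has already killed the hung child process.
            pass
        if attempt < attempts:
            time.sleep(wait_s)  # give the remote side time to recover
    return False


if __name__ == "__main__":
    # Placeholder command; a real job would call the transfer script instead.
    placeholder = [sys.executable, "-c", "print('transfer placeholder')"]
    ok = run_with_retry(placeholder, timeout_s=10, attempts=2, wait_s=1)
    sys.exit(0 if ok else 1)
```

A non-zero exit status lets cron or a monitoring wrapper notice that every attempt failed, rather than leaving a hung process or a partial file unnoticed.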