Chapter 344: Deep Red of 2020
Lin Ran is of course not here.
He finally has some time to rest. Nominally, he is vacationing in Hawaii, but in reality, he is already in the 2020 spacetime.
Attending Nixon’s inauguration would be nothing more than checking in at a historical coordinate.
The Lin Ran who had just arrived in the 1960 spacetime would never have missed such a historic moment.
But now it is his turn to create historic moments. The Huntsville Longzhong Plan, the secret chamber strategy to unite China against the Soviet Union, Hoover’s assassination: he personally created all of these moments, which later generations will look back on as deeply significant.
As long as I want, I can create historically significant moments anytime.
Lin Ran has such confidence. Now, he has no interest at all in attending Nixon’s presidential inauguration ceremony.
Only his own presidential inauguration, or Nixon’s visit to Yanjing, might make him give up the chance to return to the 2020 spacetime.
The president changes, the White House completes another donkey-elephant switch. Jenny, as the Editor-in-Chief of the New York Times, has no time to rest during this period. Lin Ran also helped her arrange exclusive interviews with Lyndon Johnson and Richard Nixon respectively.
Jenny also has no time to come to Hawaii to find him.
This means that this vacation can be spent entirely in the 2020 spacetime, with only very little time needed in the 1960 spacetime.
That’s right, Lin Ran needs to prepare for Nixon’s term.
Although Nixon’s presidency is short, it plays an extremely important role.
During this period, the Soviet Union reaches unprecedented strength in the Cold War and turns from defense to offense, Nixon personally overturns the Bretton Woods system, various global movements are in full swing, and China returns to the United Nations.
This period is so important, serving as a bridge between past and future. On one hand, Nixon lays the foundation for America to win the Cold War; on the other hand, abolishing Bretton Woods and allowing China to return to the international stage also means opening the prelude to multipolarization.
Of course, Lin Ran needs to prepare well for this era in the 2020 spacetime.
On the other hand, he also needs to make technical preparations for Starlink and Cyber God in 2020.
This time, he plans to stay a full month and a half on the island in Hawaii, which translates to 90 months in the 2020 spacetime.
Of course, 90 months is under ideal conditions; in reality, it might be just around seven years.
Seven years is more than enough in Lin Ran’s view.
Watching Nixon’s speech on television, Lin Ran thinks that America has thus completed the transformation from idealism to realism.
From then on, idealism will cease to exist, and politicians will gradually become tools of financial magnates until the politicians themselves turn into financial magnates.
In November 2022, OpenAI released ChatGPT. By January 2023, it had grown to 100 million users in only about 60 days, making it the fastest consumer application in history to reach that mark.
Undoubtedly, 2016 was the first year of AI: AlphaGo’s sudden appearance drew countless capital into the field, and China gave birth to the AI Six Little Dragons led by SenseTime.
The entire capital market was buzzing with artificial intelligence; a company that was not at least tangentially related to it was embarrassed to seek investment.
By the same measure, 2022 is absolutely the first year in which AI goes from an esoteric pursuit of the few to mass popularization. Artificial intelligence stops being an abstract concept, a matter of “I know this thing is awesome, but where exactly is it awesome?”, and the public gradually begins to grasp what it can actually do.
Of course, for Chinese citizens, ChatGPT is region-blocked; using it requires a “ladder” (a VPN) to an American server, which is still too high a threshold for most people.
From Zhihu to Weibo to Douyin and Bilibili, the mood is uniformly one of soul-searching.
“Just after Spring Festival, ChatGPT quickly exploded in the capital circle and AI circle. Many practitioners hyped it up.
OneFlow deep learning framework founder Yuan Jinhui told Sina Finance that ChatGPT’s technological progress can be compared to the first moon landing; such progress shocked the industry.”
“An AI practitioner told reporters that artificial intelligence has a wave every five or six years. The previous wave AlphaGo shocked everyone; this wave is ChatGPT.
But the mindset this time is vastly different. When Google’s AI beat the Go world champion, everyone treated it as news, but this time many people are experiencing it from a consumer perspective.
In one month, 1 million users worldwide are using and experiencing it. This is a very disruptive experience. This is also AI’s first large-scale self-dissemination.”
“After ChatGPT’s launch, a Baidu senior executive said in a media interview that he had no interest in discussing ChatGPT, his words mixed with complex emotions.
An artificial intelligence company founder said that, facing ChatGPT’s stunning performance, he felt both eager and at a loss, and could not sleep. He admitted frankly that from model scale to results, the gap is still quite large.
Someone put the same question to ChatGPT and to a so-called artificial intelligence from a domestic manufacturer at the same time. ChatGPT far surpassed the domestic model in the logic and completeness of its answers. The domestic large model’s answers had an obvious patchwork feel, mixed with quite a bit of fabricated content unrelated to the topic, and ChatGPT also led by a wide margin in response speed.
Tekan Technology CEO Le Cheng, who works on digital human research and development, believes that no artificial intelligence in the world can currently compete with ChatGPT. The industry consensus is that the gap is more than two years; domestically, forget about overtaking on the curve, since catching up as soon as possible matters more.”
The entire industry is pessimistic.
Pessimistic sentiment is beyond measure. Technical gaps exist, of course, but hardware gaps make practitioners feel despair.
Because if the essence is large models, relying on massive data and massive computing power for training to achieve effects, then China will find it hard to catch up.
The three elements of artificial intelligence: computing power, data, and algorithms. For a long time, China’s practitioners believed their advantage in competing with Silicon Valley lay in algorithms.
After ChatGPT appeared, even without knowing the technical details, one could tell from Sam Altman’s fragmentary remarks in an interview that it is actually the result of intelligence emerging from training on massive data.
ChatGPT belongs to the GPT-3 generation; OpenAI had previously released GPT-1 and GPT-2, and the progression from one generation to the next bears this out.
For a time, American morale soared, the US stock market was in full swing, and the gloom caused by the virus outbreak was swept away.
Of course, all the major domestic players are urgently mobilizing troops, trying to launch their own large models as soon as possible.
No matter how awesome ChatGPT is, it has not entered China. First solve the problem of having a model at all, then talk about catching up.
Among them is the company jointly established by Tencent and Lin Ran himself. Tencent dispatches all of its troops in the artificial intelligence field to it, and the Nvidia compute card cluster in Tencent Cloud is also placed at its disposal.
Troops don’t move without provisions going first.
Under Pony’s personal promotion, this newly established company, named Alpha Technology by Lin Ran, has the highest permissions and the greatest resources in all of Tencent.
Tencent has built up a massive artificial intelligence team, a few thousand strong in total, and is still constantly recruiting soldiers and buying horses.
That headcount is definitely not small. After all, Tencent’s business volume is massive, and large models are not the only thing that counts as artificial intelligence: image recognition, financial risk control, speech recognition, computer vision, and so on all count as well.
Zhao Songxia is one of Tencent’s many algorithm engineers. From November last year, he received an order to be temporarily and urgently transferred to Shanghai for work. His organizational relationship is still at Tencent, but he is working at a company called Alpha Technology.
Why is he called Zhao Songxia? Because the year he was born, his father earned a little money and the family bought a Panasonic television, the most expensive item in the house, and he was named after it: Songxia is the Chinese name for Panasonic. Besides, the “Songxia” in the classical line “Beneath the pines, I asked the boy” is written the same way.
This is not uncommon.
When Zhao Songxia received the notice, he thought he was being exiled.
Going from Pengcheng to Shanghai is hardly exile, but in the past only outsourced staff came to headquarters to work. When has headquarters staff ever gone to work at an external company?
The leader said that, apart from a few people staying behind to maintain the business, everyone else has to go to Shanghai. The Shanghai company covers accommodation; go for half a year first, then decide based on the situation after that.
If not for the fact that everyone was going, Zhao Songxia would even have thought of job-hopping. He has received no shortage of headhunter calls recently. As an algorithm engineer with over five years at Tencent, and with work somewhat related to AI, he has been especially in demand lately.
Only after arriving here did he realize this isn’t exile, but an unprecedented battle—a battle targeting artificial intelligence.
So many colleagues engaged in artificial intelligence-related work have come here; whether or not their previous work involved LLMs, all of them are now here to work on an LLM.
Even Zhang, Tencent’s head of artificial intelligence, the big name hired in early 2021 as a level 17 researcher, Tencent’s highest professional rank, has come.
Anyone in Tencent he can name is in Shanghai.
“Tencent battle?” Zhao Songxia thinks, “This is quite rare. But can LLM really be solved by a battle?”
In the internet industry, when a project is on the eve of launch, the company typically gathers forces from other groups, and everyone’s work intensity and hours increase. This is fondly called a battle, meaning concentrating forces to win this fight.
The Hundred-Group Battle, the Didi and Kuaidi contest, all belong to this category.
But such battles are more common in e-commerce like Pinduoduo, Meituan, Taobao, JD.com—after all, there’s Double 11 every year, 618 every year.
For Tencent, it’s really not common. Even for an important game launch considered another cash cow internally at Tencent, it wouldn’t go to such lengths.
Obviously, this time is extraordinary.
Only after seeing Lin Ran at the company did Zhao Songxia know why it’s extraordinary.
“No wonder security is so strict. Even after being cleared, you still have to scan a code every day and open your bag for inspection, as strict as an airport. No wonder, with the professor here.”
Big boss Pony shows unconditional trust in Lin Ran, believing he can lead Tencent to break through again in artificial intelligence’s moon landing, giving all possible resource support.
Zhao Songxia, like every Tencent engineer involved, has some doubt in his heart: you’re awesome, no doubt, a top big shot in aerospace and mathematics, with a PhD in GraphAI as well, but can you really handle LLMs and produce a large model comparable to ChatGPT?
Furthermore, from the arranged accommodation time, everyone can see that the company-provided accommodation is for half a year, meaning Tencent gives such massive resources for half a year—results expected in half a year.
Everyone will have doubts in their hearts.
“Everyone, I won’t do too much self-introduction. I’m Lin Ran. This time, I’m leading everyone in researching our own large model. I call it Alpha.
My goal is to construct a generative artificial intelligence better than GPT within three months.
Our computing power doesn’t match OpenAI, so we must optimize at the algorithm level and from the data angle.
At the same time, we need to solve ChatGPT’s problems, eliminate artificial intelligence hallucinations, provide smarter answers, and have superior abilities.
In short, over the next half year, I need everyone’s cooperation and assistance.
I am the brain; I build its algorithms and underlying architecture, but I need everyone to cooperate on other work.”
In large models, the brain is undoubtedly the most important, but other work is indispensable too—like data preparation, model integration and deployment, code generation testing and debugging, full-stack development and automation.
All of this work is needed to take an LLM from the laboratory to practical application.
“We can decompose the model into multiple expert sub-modules and activate only part of the parameters, with a routing mechanism selecting which experts handle each input; extend this to dynamic MoE, then introduce adaptive routing to further reduce inference cost.”
“Compress the key-value cache, using latent representations to reduce the attention mechanism’s memory usage while maintaining multi-head parallelism, then integrate knowledge graphs to alleviate hallucinations while optimizing for low-computing-power training.”
“Use an 8-bit floating point format for computation, combined with higher-precision accumulation to avoid precision loss, and extend the fine-grained quantization strategy to an FP4/INT8 hybrid.”
“Use a load-balancing strategy in MoE that needs no auxiliary loss, ensuring high expert utilization without extra training burden; extend it to unsupervised balancing and apply it to edge AI training.”
“Predict multiple subsequent tokens simultaneously, densifying the training signal to improve data efficiency, and combine this with chain prediction.”
“Inject facts using knowledge graphs and de-fit the model to correct bias; self-refinement reduces retrieval overhead.”
Zhao Songxia watches with his own eyes as their artificial intelligence named Alpha advances at an unimaginable speed.
He seriously wonders whether Professor Lin is some kind of freak. He rarely comes in, but every time he does, he brings breakthrough progress.
This time they used a lot of new methods—either paper methods implemented in engineering for the first time, or methods that never appeared before.
What they do not know is that, although Lin Ran seems to come in just two days a week, two years have in fact passed for him in the 1960 spacetime, and he has thought very deeply about the LLM route.
Even with five years of experience, Zhao Songxia can only do peripheral work, but that does not stop him from making rapid progress by gnawing through papers and listening to Lin Ran’s lectures.
He received the notice in November, started work in Shanghai in December, went back for three days during Spring Festival in between, with full overtime pay.
Before going back for Spring Festival, the executive team led by Pony came to inspect the latest progress.
The huge meeting room was packed. Zhao Songxia sat at the very edge. Lin Ran and Pony chatted amiably in the first row, with other Tencent executives uniformly seated behind.
“Good, since everyone’s here, no more nonsense, let’s get started directly,” Lin Ran said.
This time, he personally demonstrates:
“ChatGPT is great, but it sometimes errs, ‘hallucinates’ non-existent facts, and requires massive computing power.
Our Deep Red, through MoE architecture and MLA technology, to some extent eliminates the hallucination problem and achieves faster responses at lower computing power.”
The engineers in the audience don’t understand why General Manager Lin is so obsessed. Wasn’t it called Alpha before, which sounded nice? Recently, he insists on changing it to Deep Red.
Lin Ran taps the keyboard, and a clean chat interface pops up on the screen: “Please write a historical poem about the Great Wall in Chinese and explain the reasons for building the Great Wall.”
Deep Red’s response appears quickly:
“Towering Great Wall spans ten thousand li, Qin Emperor unites to build mighty passes. Defend against Hu, repel invaders to secure the frontier, eternal through millennia of frost and wind.”
Followed by a detailed explanation: “The Great Wall was first built after Qin Shi Huang unified China, mainly to defend against invasions by northern nomadic peoples like the Xiongnu. Subsequent dynasties like Han and Ming continuously reinforced it. It is not only a military engineering feat but also symbolizes national cohesion. Historical records show the Qin Dynasty mobilized hundreds of thousands of laborers, taking years.”
Lin Ran then demonstrates some of the common everyday uses of ChatGPT, and in each one Deep Red gives more accurate, smarter answers.
Until Lin Ran inputs:
“You play a master software engineer of exceptional ability, but can only say one sentence of no more than 20 words each time. I will provide a specific scenario; you describe what the master would say in that scenario. Note: The master cannot directly express his views and must use philosophy or Buddhism-related knowledge in metaphors. Please only reply with the specific dialogue content.
Scenario: An engineer is sitting in front of his laptop, deleting some unit tests that fail because of a newly introduced feature, and is about to push the code when the master passes behind him.”
Deep Red replies: “Master: Cutting roots for lush leaves yields but fleeting spring.”