---
abstract: |
  Time series analysis is widely used in extensive areas. Recently, to reduce labeling expenses and benefit various tasks, self-supervised pre-training has attracted immense interest. One mainstream paradigm is masked modeling, which successfully pre-trains deep models by learning to reconstruct the masked content based on the unmasked part. However, since the semantic information of time series is mainly contained in temporal variations, the standard way of randomly masking a portion of time points will seriously ruin vital temporal variations of time series, making the reconstruction task too difficult to guide representation learning. We thus present SimMTM, a Simple pre-training framework for Masked Time-series Modeling. By relating masked modeling to manifold learning, SimMTM proposes to recover masked time points by the weighted aggregation of multiple neighbors outside the manifold, which eases the reconstruction task by assembling ruined but complementary temporal variations from multiple masked series. SimMTM further learns to uncover the local structure of the manifold, which is helpful for masked modeling. Experimentally, SimMTM achieves state-of-the-art fine-tuning performance compared to the most advanced time series pre-training methods in two canonical time series analysis tasks: forecasting and classification, covering both in- and cross-domain settings. `\update{Code is available at \href{https://github.com/thuml/SimMTM}{https://github.com/thuml/SimMTM}.}`{=latex}
author:
- |
  Jiaxiang Dong[^1], Haixu Wu`\footnotemark[1]`{=latex}, Haoran Zhang, Li Zhang, Jianmin Wang, Mingsheng Long^`\Letter`{=latex}^\
  School of Software, BNRist, Tsinghua University, China\
  `\small `{=latex}`{djx20,z-hr20}@mails.tsinghua.edu.cn,wuhaixu98@gmail.com`\
  `\small `{=latex}`{lizhang,jimwang,mingsheng}@tsinghua.edu.cn`
bibliography:
- reference.bib
title: 'SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling'
---

```{=latex}
\newcommand{\update}[1]{{\textcolor{black}{#1}}}
```
```{=latex}
\maketitle
```
Introduction
============

Time series analysis is of immense importance in extensive real-world applications, such as financial analysis and energy planning [@wu2021-autoformer; @xu2021anomaly]. Vast amounts of time series are incrementally collected from IoT and wearable devices. However, the semantic information of time series is mainly buried in human-indiscernible temporal variations, making it difficult to annotate. Recently, self-supervised pre-training has been widely explored [@liu2021self; @Jiang2022TransferabilityID], which equips deep models with pretext knowledge learned over large-scale unlabeled data and further promotes the performance of various downstream tasks. Notably, as a well-recognized pre-training paradigm, masked modeling has achieved great success in many areas, such as masked language modeling (MLM) [@Devlin2018BERT; @radford2019languageGPT-2; @raffel2020exploring; @Brown2020GPT-1; @Gao2021GPT-3] and masked image modeling (MIM) [@He2022-MAE; @xie2022simmim; @Li2022-FLIP]. This paper extends pre-training methods to time series, especially masked time-series modeling (MTM).

The canonical technique of masked modeling is to optimize the model by learning to reconstruct the masked content based on the unmasked part [@Devlin2018BERT]. However, unlike images and natural languages, whose patches or words contain rich, even redundant, semantic information, we observe that the valuable semantic information of time series is mainly included in the temporal variations, such as the trend, periodicity, and peak valley, which can correspond to unique weather processes, abnormal faults, etc. in the real world. Therefore, directly masking a portion of time points will seriously ruin the temporal variations of the original time series, making the reconstruction task too difficult to guide representation learning of time series (Figure `\ref{fig:intro}`{=latex}).

```{=latex}
\begin{figure*}[tb]\begin{center}
    \center{\includegraphics[width=1\textwidth]{fig/intro_new.pdf}}
    \vspace{-15pt}
\caption{Comparison between (a) canonical masked modeling and (b) SimMTM in both manifold perspective and reconstruction performance. The showcase is to recover 50\% masked time series.}
    \label{fig:intro}
\end{center}
\vspace{-15pt}
\end{figure*}
```
According to the analysis in stacked denoising autoencoders [@Vincent2010StackedDA], as shown in Figure `\ref{fig:intro}`{=latex}, we can view the randomly masked series as the \`\`neighbor" of the original time series outside the manifold and the reconstruction process is to project the masked series back to the manifold of original series. However, as we analyzed above, direct reconstruction may fail since the essential temporal variations are ruined by random masking. Inspired by the manifold perspective, we go beyond the straightforward reconstruction convention of masked modeling and propose a natural idea of reconstructing the original data from its *multiple* \`\`neighbors", i.e. multiple masked series. Although the temporal variations of the original time series have been partially dropped in each randomly masked series, the multiple randomly masked series will complement each other, making the reconstruction process much more accessible than directly reconstructing the original series from a single masked series. This process will also pre-train the model to uncover the local structure of the time series manifold implicitly, thereby benefiting masked modeling and representation learning [@Schroff2015FaceNetAU; @wang2020understanding].

Based on the above insights, we propose SimMTM as a simple but effective pre-training framework for time series. Instead of directly reconstructing the masked time points from the unmasked parts, SimMTM recovers the original time series from multiple randomly masked time series. Technically, SimMTM presents a neighborhood aggregation design for reconstruction, which aggregates the point-wise representations of time series based on similarities learned in the series-wise representation space. In addition to the reconstruction loss, a constraint loss is presented to guide the series-wise representation learning based on the neighborhood assumption of the time series manifold. Empowered by the above designs, SimMTM achieves consistent state-of-the-art performance in various time series analysis tasks when the pre-trained model is fine-tuned on downstream tasks, covering both low-level forecasting and high-level classification, even if there is a clear domain shift between the pre-training and fine-tuning datasets. Overall, our contributions are summarized as follows:

-   Inspired by the manifold perspective of masking, we propose a new task for masked time-series modeling, which is to reconstruct the original time series on the manifold based on multiple masked series outside the manifold.

-   Technically, we present SimMTM as a simple but effective pre-training framework, which aggregates point-wise representations for reconstruction based on the similarities learned in series-wise representation space.

-   SimMTM consistently achieves the state-of-the-art fine-tuning performance in typical time series analysis tasks, including low-level forecasting and high-level classification, covering both in- and cross-domain settings.

```{=latex}
\vspace{-5pt}
```
Related Work
============

Self-supervised Pre-training
----------------------------

Self-supervised pre-training is an important research topic for learning generalizable and shared knowledge from large-scale data and further benefiting downstream tasks [@Jiang2022TransferabilityID]. Originally, this topic was widely explored in computer vision and natural language processing, where elaborate, manually designed self-supervised tasks were presented, which can be roughly categorized into contrastive learning [@he2020momentum; @chen2020simple; @caron2020unsupervised-swav] and masked modeling [@Devlin2018BERT; @He2022-MAE]. Recently, following the well-established contrastive learning and masked modeling paradigms, some self-supervised pre-training methods for time series have been proposed [@franceschi2019unsupervised-pre4; @nonnenmacher2022utilizing-expclr; @sarkar2020self-pre2; @rebjock2021online-pre0; @sun2021adjusting-pre1; @yeche2021neighborhood-ncl; @yang2022unsupervised-btsf; @cheng2023timemae-tkde].

```{=latex}
\vspace{25pt}
```
#### Contrastive learning.

The critical insight of contrastive learning is to optimize the representation space based on manually designed positive and negative pairs, where representations of positive pairs are optimized to be close to each other, while those of negative pairs are pushed far apart [@Wu2018UnsupervisedFL; @Jaiswal2020ASO]. The canonical design presented in SimCLR [@tang2020exploring-simCLR] views different augmentations of the same sample as positive pairs and augmentations of different samples as negative pairs.

Recently, in time series pre-training, many designs of positive and negative pairs have been proposed by utilizing the invariant properties of time series. Concretely, to make the representation learning seamlessly related to temporal variations, TimeCLR [@yang2022-timeclr] adopts DTW [@Mueen2016ExtractingOP] to generate phase-shift and amplitude-change augmentations, which are more suitable for the time series context. TS2Vec [@Yue2022-TS2Vec] splits multiple time series into several patches and further defines the contrastive loss in both instance-wise and patch-wise aspects. TS-TCC [@eldele2021time-TS-TCC] presents a new temporal contrastive learning task that requires the augmentations to predict each other's future. Mixing-up [@wickstrom2022mixing-Mixing-up] exploits a data augmentation scheme in which new samples are generated by mixing two data samples. LaST [@wang2022learning-LaST] aims to disentangle the seasonal-trend representations in the latent space based on variational inference. Afterward, CoST [@woo2022cost-CoST] employs contrastive losses in both the time and frequency domains to learn discriminative seasonal and trend representations. Besides, TF-C [@Zhang2022-TF-C] proposes a novel time-frequency consistency architecture and optimizes the time-based and frequency-based representations of the same example to be close to each other. Note that contrastive learning mainly focuses on high-level information [@Xie2022RevealingTD], and the series-wise or patch-wise representations inherently mismatch the low-level tasks, such as time series forecasting. Thus, in this paper, we focus on the masked modeling paradigm.

```{=latex}
\vspace{-5pt}
```
#### Masked modeling.

The masked modeling paradigm optimizes the model by learning to reconstruct the masked content from the unmasked part. This paradigm has been widely explored in computer vision and natural language processing, where it is used to predict the masked words of a sentence [@Devlin2018BERT] and the masked patches of an image [@baevski2022data2vec; @He2022-MAE; @xie2022simmim] respectively. As for time series analysis, TST [@zerveas2021transformer-TST] follows the canonical masked modeling paradigm and learns to predict removed time points based on the remaining time points. Next, PatchTST [@nie2022time-patch] proposes to predict masked subseries-level patches to capture the local semantic information and reduce memory usage. Ti-MAE [@li2023ti-TiMAE] uses masked modeling as an auxiliary task to boost the forecasting and classification performance of advanced Transformer-based methods. However, as we stated before, directly masking time series will ruin the essential temporal variations, making the reconstruction too difficult to guide the representation learning. Unlike the direct reconstruction in previous works, SimMTM presents a new masked modeling task, which is to reconstruct the original time series from multiple randomly masked series.

Understanding Masked Modeling
-----------------------------

Masked modeling has been explored in stacked denoising autoencoders [@Vincent2010StackedDA], where masking is viewed as adding noise to the original data and masked modeling is to project the masked data from the neighborhood back to the original manifold, namely denoising. It has recently been widely used in pre-training, which can learn valuable low-level information from data in an unsupervised manner [@Xie2022RevealingTD]. Inspired by the manifold perspective, we go beyond the classical denoising process and project the masked data back to the manifold by aggregating multiple masked time series within the neighborhood.

SimMTM
======

As aforementioned, to tackle the problem that randomly masking time series will ruin the essential temporal variation information, SimMTM proposes to reconstruct the original time series from multiple masked time series. To implement this, SimMTM first learns similarities among multiple time series in the series-wise representation space and then aggregates the point-wise representations of these time series based on the learned series-wise similarities. Next, we detail the techniques in terms of both the model architecture and the pre-training protocol.

Overall Architecture
--------------------

The reconstruction process of SimMTM involves the following four modules: masking, representation learning, series-wise similarity learning and point-wise reconstruction.

```{=latex}
\begin{figure*}[t]
\begin{center}
\centerline{\includegraphics[width=1.0\textwidth]{fig/architecture.pdf}}
    \caption{Architecture of SimMTM, which reconstructs the original time series by adaptively aggregating multiple masked time series based on series-wise similarities learned contrastively from data.}
    \label{fig:arch}
\end{center}
\vspace{-20pt}
\end{figure*}
```
#### Masking.

Given $\{\mathbf{x}_{i}\}_{i=1}^{N}$ as a mini-batch of $N$ time series samples, where $\mathbf{x}_{i}\in\mathbb{R}^{L\times C}$ contains $L$ time points and $C$ observed variates, we can easily generate a set of masked series for each sample $\mathbf{x}_{i}$ by randomly masking a portion of time points along the temporal dimension, formalized as: $$\label{equ:aug_series}
    \{\overline{\mathbf{x}}_{i}^{j}\}_{j=1}^{M}=\operatorname{Mask}_{r}(\mathbf{x}_{i}),$$ where $r\in[0,1]$ denotes the masked portion. $M$ is a hyperparameter for the number of masked time series. $\overline{\mathbf{x}}_{i}^{j}\in\mathbb{R}^{L\times C}$ represents the $j$-th masked time series of $\mathbf{x}_{i}$, where the values of masked time points are replaced by zeros. With the above process, we can obtain a batch of augmented time series. For clarity, we present all the $\left(N(M+1)\right)$ input series in a set as $\mathcal{X} = \bigcup_{i=1}^{N}\left(\{\mathbf{x}_{i}\}\cup \{\overline{\mathbf{x}}_{i}^{j}\}_{j=1}^{M}\right)$.
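As a concrete illustration, the masking step can be sketched in a few lines of NumPy. The helper `random_mask` below is hypothetical (not the official implementation) and zeroes out exactly $\lfloor rL\rfloor$ time points per view, mirroring the operator $\operatorname{Mask}_{r}$:

```python
import numpy as np

def random_mask(x, r=0.5, M=3, seed=0):
    """Generate M masked views of a series x of shape (L, C): in each view,
    floor(r * L) randomly chosen time points are replaced by zeros."""
    rng = np.random.default_rng(seed)
    L, _ = x.shape
    views = []
    for _ in range(M):
        drop = rng.choice(L, size=int(r * L), replace=False)
        mask = np.ones(L)
        mask[drop] = 0.0                 # zero out the masked time points
        views.append(x * mask[:, None])  # broadcast the mask over channels
    return views

x = np.arange(12, dtype=float).reshape(6, 2)   # toy series: L = 6, C = 2
views = random_mask(x, r=0.5, M=3)
assert len(views) == 3
assert all(v.shape == x.shape for v in views)
assert all((v == 0).all(axis=1).sum() == 3 for v in views)  # 3 of 6 points masked
```

Together with the original series, this yields the set $\mathcal{X}$ of $N(M+1)$ inputs per mini-batch.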

#### Representation learning.

After the encoder and projector layer, we can obtain the point-wise representations $\mathcal{Z}$ and series-wise representations $\mathcal{S}$, which can be formalized as: $$\begin{split}
\mathcal{Z}=\bigcup_{i=1}^{N}\left(\{\mathbf{z}_{i}\}\cup \{\overline{\mathbf{z}}_{i}^{j}\}_{j=1}^{M}\right)=\operatorname{Encoder}(\mathcal{X}), \ \ \mathcal{S}=\bigcup_{i=1}^{N}\left(\{\mathbf{s}_{i}\}\cup \{\overline{\mathbf{s}}_{i}^{j}\}_{j=1}^{M}\right)=\operatorname{Projector}(\mathcal{Z}),
\end{split}$$ where $\mathbf{z}_{i},\overline{\mathbf{z}}_{i}^{j}\in\mathbb{R}^{L\times d_{\text{model}}}$ and $\mathbf{s}_{i},\overline{\mathbf{s}}_{i}^{j}\in\mathbb{R}^{1\times d_{\text{model}}}$. $\operatorname{Encoder}(\cdot)$ denotes the model encoder, which projects the input data into deep representations and will be transferred to downstream tasks during the fine-tuning process. In this paper, we implement the encoder as the well-acknowledged Transformer [@vaswani2017attention-Transformer] or ResNet [@He2016DeepRL]. As for $\operatorname{Projector}(\cdot)$, we employ a simple MLP layer along the temporal dimension to obtain series-wise representations. More details can be found in Section `\ref{sec:exp}`{=latex}. `\update{Technically, the encoder is applied to each input series separately, namely~$\bigcup_{i=1}^N(\{\operatorname{Encoder}(\mathbf{x}_{i})\}\cup\{\operatorname{Encoder}(\overline{\mathbf{x}}_{i}^{j})\}_{j=1}^{M})$, and the same holds for the projector. Here we adopt the set-style formalization for conciseness.}`{=latex}
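To make the tensor shapes above concrete, the following sketch stands in for the encoder and projector with plain linear maps; only the shapes, not the architectures (Transformer/ResNet encoder, MLP projector), reflect the paper, and all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, L, C, d_model = 2, 3, 6, 2, 8

# Stand-ins for Encoder/Projector: plain linear maps that only show shapes.
W_enc = rng.normal(size=(C, d_model))
w_proj = rng.normal(size=(L,))            # pools the temporal dimension

def encoder(x):                           # (L, C) -> (L, d_model): point-wise z
    return x @ W_enc

def projector(z):                         # (L, d_model) -> (1, d_model): series-wise s
    return (w_proj @ z)[None, :]

# N original series plus M masked views each: N * (M + 1) inputs in total,
# each passed through the encoder and projector separately.
batch = [rng.normal(size=(L, C)) for _ in range(N * (M + 1))]
Z = [encoder(x) for x in batch]
S = [projector(z) for z in Z]
assert Z[0].shape == (L, d_model) and S[0].shape == (1, d_model)
assert len(Z) == len(S) == N * (M + 1)
```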

#### Series-wise similarity learning.

Note that directly averaging multiple masked time series will result in the over-smoothing problem [@Vincent2010StackedDA], impeding the representation learning. Thus, to precisely reconstruct the original time series, we attempt to utilize the similarities among series-wise representations $\mathcal{S}$ for weighted aggregation, namely exploiting the local structure of the time series manifold. For simplification, we formalize the calculation of series-wise similarities as follows: $$\begin{split}
\update{
\mathbf{R} = \operatorname{Sim}(\mathcal{S}) \in \mathbb{R}^{D\times D}, D = N(M+1)}, \ \
\mathbf{R}_{\mathbf{u},\mathbf{v}}=\frac{\mathbf{u}\mathbf{v}^{\sf T}}{\|\mathbf{u}\|\|\mathbf{v}\|}, \mathbf{u},\mathbf{v}\in\mathcal{S}, 
\end{split}$$ where $\mathbf{R}$ is the matrix of pair-wise similarities `\update{for $\left(N(M+1)\right)$}`{=latex} input samples in the series-wise representation space, which are measured by cosine similarity. $\mathbf{R}_{\mathbf{u},\mathbf{v}}$ is the calculated similarity between series-wise representations $\mathbf{u},\mathbf{v}\in\mathcal{S}$.
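Concretely, $\mathbf{R}$ is a cosine-similarity Gram matrix over the $D=N(M+1)$ series-wise vectors. A minimal NumPy sketch (the function name `similarity_matrix` is ours, not from the official code):

```python
import numpy as np

def similarity_matrix(S):
    """Pairwise cosine similarities for D = N(M+1) series-wise vectors.

    S has shape (D, d_model); the result R has shape (D, D) with
    R[u, v] = <s_u, s_v> / (||s_u|| ||s_v||), as in the text.
    """
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    U = S / np.clip(norms, 1e-12, None)    # unit-normalize each row
    return U @ U.T

S = np.random.default_rng(0).normal(size=(8, 4))   # D = 8 toy vectors
R = similarity_matrix(S)
assert R.shape == (8, 8)
assert np.allclose(np.diag(R), 1.0)        # self-similarity equals 1
assert np.allclose(R, R.T)                 # symmetric by construction
```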

#### Point-wise aggregation.

As shown in Figure `\ref{fig:arch}`{=latex}, based on the learned series-wise similarities, the aggregation process for the $i$-th original time series is: $$\label{equ:aggregation}
\begin{split}
    \widehat{\mathbf{z}}_{i} = \sum_{\mathbf{s}^\prime \in \mathcal{S}\backslash\{\mathbf{s}_{i}\}}\frac{\operatorname{exp}({\mathbf{R}_{\mathbf{s}_{i},{\mathbf{s}^\prime}}/\tau})}{\sum_{{\mathbf{s}^\prime}^\prime \in \mathcal{S}\backslash\{\mathbf{s}_{i}\}} \operatorname{exp}({\mathbf{R}_{\mathbf{s}_{i},{\mathbf{s}^\prime}^\prime}/\tau})}\mathbf{z}^\prime,
\end{split}$$ `\update{where $\mathbf{z}^\prime$ represents the point-wise representation corresponding to $\mathbf{s}^\prime$, i.e.~$\mathbf{s}^\prime=\operatorname{Projector}(\mathbf{z}^\prime)$.}`{=latex} $\widehat{\mathbf{z}}_{i}\in\mathbb{R}^{L\times d_{\text{model}}}$ is the reconstructed point-wise representation. $\tau$ denotes the temperature hyperparameter of the softmax normalization for series-wise similarities. It is notable that, as described in Eq. `\eqref{equ:aggregation}`{=latex}, for each time series $\mathbf{x}_{i}$, the reconstruction is not only based on its own masked series $\{\overline{\mathbf{x}}_{i}^{j}\}_{j=1}^{M}$. We also introduce the other series representations $\mathcal{S}\backslash\{\mathbf{s}_{i}\}$ into the aggregation, which requires the model to suppress the interference of less-related noise series and precisely learn similar representations for both the masked and the original series, namely guiding the model to learn the manifold structure better. After the decoder, we can obtain the reconstructed original time series, namely $$\begin{split}
\{\widehat{\mathbf{x}}_{i}\}_{i=1}^{N}=\operatorname{Decoder}(\{\widehat{\mathbf{z}}_{i}\}_{i=1}^{N}),
\end{split}$$ where $\widehat{\mathbf{x}}_{i}\in\mathbb{R}^{L\times C}$ is the reconstruction to $\mathbf{x}_{i}$. $\operatorname{Decoder}(\cdot)$ is instantiated as a simple MLP layer along the channel dimension following [@xie2022simmim].
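The aggregation in Eq. `\eqref{equ:aggregation}`{=latex} can be sketched as follows: for the $i$-th series, a temperature-scaled softmax over its similarities to all other representations weights their point-wise representations. The helper `aggregate` is illustrative, not the official implementation:

```python
import numpy as np

def aggregate(i, R, Z, tau=0.1):
    """Reconstruct z_hat_i: a softmax (temperature tau) over similarities
    R[i, j] for all j != i weights the point-wise representations Z[j]."""
    D = R.shape[0]
    others = [j for j in range(D) if j != i]        # S \ {s_i}: exclude self
    logits = np.array([R[i, j] for j in others]) / tau
    w = np.exp(logits - logits.max())               # numerically stable softmax
    w /= w.sum()
    return sum(wj * Z[j] for wj, j in zip(w, others))

rng = np.random.default_rng(0)
D, L, d = 6, 5, 4
R = rng.uniform(-1, 1, size=(D, D))
R = (R + R.T) / 2                                   # toy symmetric similarity matrix
Z = [rng.normal(size=(L, d)) for _ in range(D)]
z_hat = aggregate(0, R, Z)
assert z_hat.shape == (L, d)
```

Excluding the sample's own representation from the weighted sum is what forces the reconstruction to rely on the masked views and other neighbors rather than trivially copying itself.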

Self-supervised Pre-training
----------------------------

Following the masked modeling paradigm, SimMTM is supervised by a reconstruction loss: $${\cal L}_{\text{reconstruction}}=\sum_{i=1}^{N}\|\mathbf{x}_{i}-\widehat{\mathbf{x}}_{i}\|_{2}^{2}.$$ Note that the reconstruction process is directly based on the series-wise similarities, while it is hard to guarantee the model captures the precise similarities without explicit constraints in the series-wise representation space. Thus, to avoid trivial aggregation, we also utilize the neighborhood assumption of the time series manifold to calibrate the structure of series-wise representation space $\mathcal{S}$. For clarity, we formalize the neighborhood assumption by defining positive and negative pairs as follows: $$\label{equ:prior}
\begin{split}
    \text{Positive pairs: }&(\mathbf{s}_{i},\mathbf{s}_{i}^{+}),\ \mathbf{s}_{i}^{+}\in\{\overline{\mathbf{s}}_{i}^{j}\}_{j=1}^{M}\\
    \text{Negative pairs: }& (\mathbf{s}_{i},\mathbf{s}_{i}^{-}),\ \mathbf{s}_{i}^{-}\in\{\mathbf{s}_{k}\}\cup \{\overline{\mathbf{s}}_{k}^{j}\}_{j=1}^{M}, i\neq k
\end{split}$$ `\update{where $\mathbf{s}_{i}^{+}$ and $\mathbf{s}_{i}^{-}$ denote the elements assumed to be close to and far away from $\mathbf{s}_{i}$ respectively.}`{=latex} Eq. `\eqref{equ:prior}`{=latex} indicates that the original time series and its masked series will present close representations and be far away from the representations of other series in $\mathcal{S}$. For each series-wise representation $\mathbf{s}\in\mathcal{S}$, we define the set of its assumed close series as ${\update{\mathcal{S}}^{+}}\subset \mathcal{S}$. Note that to avoid a dominating self-representation, we assume that $\mathbf{s} \notin {\update{\mathcal{S}}^{+}}$. With the above formalization, we can define the manifold constraint on the series-wise representation space as: $$\label{equ:constraint}
\begin{split}
    &{\cal L}_{\text{constraint}}=-\sum_{\mathbf{s}\in\mathcal{S}}\left(\sum_{\mathbf{s}^\prime\in \update{\mathcal{S}}^{+}}\log \frac{\operatorname{exp}(\mathbf{R}_{\mathbf{s},\mathbf{s}^\prime}/\tau)}{\sum_{{\mathbf{s}^\prime}^\prime\in\mathcal{S}\backslash\{\mathbf{s}\}}\operatorname{exp}(\mathbf{R}_{\mathbf{s},{\mathbf{s}^\prime}^\prime}/\tau)}\right),
\end{split}$$ which can optimize the learned series-wise representations to satisfy the neighborhood assumption in Eq. `\eqref{equ:prior}`{=latex} better. Finally, the overall optimization process of SimMTM can be represented as follows: $$\label{equ:overall_loss}
    \min_{\Theta} {\cal L}_{\text{reconstruction}}+\lambda {\cal L}_{\text{constraint}},$$ where $\Theta$ denotes the set of all parameters of the deep architecture. To trade off the two parts in Eq. `\eqref{equ:overall_loss}`{=latex}, we adopt the tuning strategy presented by [@kendall2018multi], which adjusts the hyperparameter $\lambda$ adaptively according to the homoscedastic uncertainty of each loss.
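A minimal sketch of the constraint loss in Eq. `\eqref{equ:constraint}`{=latex}, assuming a sample-major ordering of the rows of $\mathbf{R}$ (each original series followed by its $M$ masked views) and treating all views of the same sample as mutual positives; this ordering and the `constraint_loss` helper are our assumptions, not the official code:

```python
import numpy as np

def constraint_loss(R, N, M, tau=0.1):
    """Manifold constraint: for each series-wise vector s, its positives
    are the other views of the same sample; the softmax denominator runs
    over all vectors except s itself. Rows of R are assumed ordered as
    [x_1, xbar_1^1..xbar_1^M, x_2, ...]."""
    D = N * (M + 1)
    loss = 0.0
    for a in range(D):
        others = [b for b in range(D) if b != a]     # S \ {s}
        logits = np.array([R[a, b] for b in others]) / tau
        log_den = np.log(np.exp(logits - logits.max()).sum()) + logits.max()
        group = a // (M + 1)                          # sample index of row a
        positives = [b for b in others if b // (M + 1) == group]
        for p in positives:                           # -log softmax prob of each positive
            loss -= R[a, p] / tau - log_den
    return loss

rng = np.random.default_rng(0)
N, M = 2, 2
D = N * (M + 1)
R = rng.uniform(-1, 1, size=(D, D))
R = (R + R.T) / 2                                     # toy symmetric similarity matrix
L_c = constraint_loss(R, N, M)
assert np.isfinite(L_c) and L_c > 0                   # sum of -log probabilities
```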

Experiments {#sec:exp}
===========

To fully evaluate SimMTM, we conduct experiments on two typical time series analysis tasks: forecasting and classification, covering low-level and high-level representation learning. Further, we present the fine-tuning performance for each task under in- and cross-domain settings.

#### Benchmarks.

We summarize the experiment benchmarks in Table `\ref{tab:benchmarks}`{=latex}, comprising twelve real-world datasets that cover two mainstream time series analysis tasks: time series forecasting and classification. Concretely, we have followed the standard experimental setups in Autoformer [@wu2021-autoformer] for the forecasting tasks and the experiment settings proposed by TF-C [@Zhang2022-TF-C] for classification.

```{=latex}
\begin{wraptable}{r}{0.5\textwidth}
    \vspace{-5pt}
    \caption{{Summary of experiment benchmarks.}}
    \label{tab:benchmarks}
    \vskip 0.05in
    \centering
    \begin{small}
        \renewcommand{\multirowsetup}{\centering}
        \setlength{\tabcolsep}{3.5pt}
        \renewcommand\arraystretch{1.3}
        \scalebox{1}{
        \begin{tabular}{lcc}
            \toprule
            \scalebox{1.0}{Tasks} & \scalebox{1.0}{Datasets} & \scalebox{1.0}{Semantic} \\
            \midrule
            \multirow{5}{*}{\scalebox{1.0}{Forecasting}} & \scalebox{1.0}{ETTh1,ETTh2} & \scalebox{1.0}{Electricity} \\
                        & \scalebox{1.0}{ETTm1,ETTm2} & \scalebox{1.0}{Electricity} \\
                        & \scalebox{1.0}{Weather} & \scalebox{1.0}{Weather}  \\
                        & \scalebox{1.0}{Electricity} & \scalebox{1.0}{Electricity}  \\
                        & \scalebox{1.0}{Traffic} & \scalebox{1.0}{Transportation}  \\                   
            \midrule
            \multirow{5}{*}{\scalebox{1.0}{Classification}} & \scalebox{1.0}{SleepEEG} & \scalebox{1.0}{EEG} \\
                & \scalebox{1.0}{Epilepsy} & \scalebox{1.0}{EEG} \\
                & \scalebox{1.0}{FD-B} & \scalebox{1.0}{Faulty Detection} \\
            & \scalebox{1.0}{Gesture} & \scalebox{1.0}{Hand Movement} \\
                & \scalebox{1.0}{EMG} & \scalebox{1.0}{Muscle Responses} \\
            \bottomrule
        \end{tabular}}
    \end{small}
    \vspace{-20pt}
\end{wraptable}
```
#### Implementations.

We compare SimMTM with six competitive state-of-the-art baseline methods, including the contrastive learning methods TF-C [@Zhang2022-TF-C], CoST [@woo2022cost-CoST], TS2Vec [@Yue2022-TS2Vec], and LaST [@wang2022learning-LaST], and the masked modeling methods Ti-MAE [@li2023ti-TiMAE] and TST [@zerveas2021transformer-TST]. TF-C [@Zhang2022-TF-C] and Ti-MAE [@li2023ti-TiMAE] are the previous best pre-training methods. We experiment with both in- and cross-domain settings. For the in-domain setting, we pre-train and fine-tune the model on the same dataset. As for the cross-domain setting, we pre-train the model on a certain dataset and fine-tune the encoder on different datasets. More details are provided in `\update{\underline{Appendix \ref{app:implementation}}.}`{=latex}

#### Unified encoder.

To make a fair comparison, we attempt to unify the encoder across these pre-training methods. Concretely, we adopt the vanilla Transformer [@vaswani2017attention-Transformer] with channel independence [@nie2022time-patch] as the unified encoder for forecasting. The channel-independent design allows models to accomplish cross-domain transfer between datasets with different numbers of variates. As for classification, we use 1D-ResNet [@He2016DeepRL] as the shared encoder following [@Zhang2022-TF-C]. Notably, for LaST [@wang2022learning-LaST] and TF-C [@Zhang2022-TF-C], since their designs are closely tied to their model structures, we directly report results from their papers or reproduce them with the official code. For the other baselines and SimMTM, the results in the main text are from the unified encoder. The results from their official papers are also compared in `\update{\underline{Appendix \ref{app:baselines}}}`{=latex}. For all baselines, the results with the unified encoder generally surpass those reported in their original papers.

Main results
------------

We summarize the results in forecasting and classification of in- and cross-domain settings in Figure `\ref{fig:mainresult}`{=latex}. In all these settings, SimMTM outperforms other baselines significantly. It is also notable that although the masking-based method Ti-MAE [@li2023ti-TiMAE] achieves good performance in the forecasting task (x-axis of Figure `\ref{fig:mainresult}`{=latex}), it fails in the classification task (y-axis). Besides, contrastive-based methods fail in low-level forecasting tasks. These results indicate that previous methods cannot simultaneously cover high-level and low-level tasks, highlighting the advantages of SimMTM in task generality.

![Performance comparison of time series pre-training methods in forecasting (*MSE* $\downarrow$) and classification (*Acc* $\uparrow$) tasks, including both in-domain (left) and cross-domain (right) settings.](fig/mainresult.png){#fig:mainresult width="102%"}

```{=latex}
\vspace{-5pt}
```
```{=latex}
\vspace{-20pt}
```
Forecasting
-----------

#### In-domain.

As shown in Table `\ref{tab:forecasting_indomain_short}`{=latex}, empowered by SimMTM pre-training, the model performance is improved significantly (SimMTM vs. Random init.). Besides, SimMTM consistently outperforms other pre-training methods. Averaged over all benchmarks, SimMTM achieves an 8.3% MSE reduction and a 4.3% MAE reduction compared to the advanced masked modeling baseline Ti-MAE [@li2023ti-TiMAE], and a 14.7% MSE reduction and a 12.0% MAE reduction compared to the contrastive baseline CoST [@woo2022cost-CoST]. It is also notable that both Ti-MAE [@li2023ti-TiMAE] and TST [@zerveas2021transformer-TST] outperform all the contrastive-based baselines. This indicates that masked modeling based on point-wise reconstruction suits the forecasting task better than series-wise contrastive pre-training.

```{=latex}
\vspace{25pt}
```
```{=latex}
\setlength{\belowcaptionskip}{-0.1cm}
```
```{=latex}
\vskip 0.1in
```
`\label{tab:forecasting_indomain_short}`{=latex} `\centering`{=latex}

```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{1.2pt}
  \renewcommand\arraystretch{1.3}
  \begin{tabular}{c|cccc|cccccccccccc}
    \toprule
    \scalebox{1.0}{Models} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{\textbf{SimMTM}}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Random init.}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Ti-MAE}} \cite{li2023ti-TiMAE}} &
    \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TST}} \cite{zerveas2021transformer-TST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{LaST}} \cite{wang2022learning-LaST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TF-C}} \cite{Zhang2022-TF-C}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{CoST}} \cite{woo2022cost-CoST}} &  \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TS2Vec}} \cite{Yue2022-TS2Vec}} \\
    \cmidrule(lr){2-17}
    \scalebox{1.0}{Metric} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} \\
    \toprule
    \scalebox{0.9}{ETTh1} & \scalebox{1.0}{\textbf{0.409}} & \scalebox{1.0}{\textbf{0.428}} & \scalebox{1.0}{0.431} & \scalebox{1.0}{0.448} & \scalebox{1.0}{0.423} & \scalebox{1.0}{0.446} & \scalebox{1.0}{0.466} & \scalebox{1.0}{0.462} & \scalebox{1.0}{0.474} & \scalebox{1.0}{0.461} & \scalebox{1.0}{0.527} & \scalebox{1.0}{0.513} & \scalebox{1.0}{0.485} & \scalebox{1.0}{0.472} & \scalebox{1.0}{0.446} & \scalebox{1.0}{0.456} \\
    \midrule
    \scalebox{0.9}{ETTh2} & \scalebox{1.0}{\textbf{0.353}} & \scalebox{1.0}{\textbf{0.390}} & \scalebox{1.0}{0.395} & \scalebox{1.0}{0.427} & \scalebox{1.0}{0.380} & \scalebox{1.0}{0.386} & \scalebox{1.0}{0.404} & \scalebox{1.0}{0.421} & \scalebox{1.0}{0.449} & \scalebox{1.0}{0.459} & \scalebox{1.0}{0.692} & \scalebox{1.0}{0.724} & \scalebox{1.0}{0.399} & \scalebox{1.0}{0.427} & \scalebox{1.0}{0.417} & \scalebox{1.0}{0.468} \\
    \midrule
    \scalebox{0.9}{ETTm1} & \scalebox{1.0}{\textbf{0.348}} & \scalebox{1.0}{\textbf{0.385}} & \scalebox{1.0}{0.356} & \scalebox{1.0}{0.387} & \scalebox{1.0}{0.366} & \scalebox{1.0}{0.391} & \scalebox{1.0}{0.373} & \scalebox{1.0}{0.389} & \scalebox{1.0}{0.398} & \scalebox{1.0}{0.398} & \scalebox{1.0}{0.496} & \scalebox{1.0}{0.474} & \scalebox{1.0}{0.356} & \scalebox{1.0}{0.385} & \scalebox{1.0}{0.699} & \scalebox{1.0}{0.557} \\
    \midrule
    \scalebox{0.9}{ETTm2} & \scalebox{1.0}{\textbf{0.263}} & \scalebox{1.0}{\textbf{0.320}} & \scalebox{1.0}{0.279} & \scalebox{1.0}{0.336} & \scalebox{1.0}{0.267} & \scalebox{1.0}{0.325} & \scalebox{1.0}{0.297} & \scalebox{1.0}{0.347} & \scalebox{1.0}{0.265} & \scalebox{1.0}{0.327} & \scalebox{1.0}{0.465} & \scalebox{1.0}{0.562} & \scalebox{1.0}{0.314} & \scalebox{1.0}{0.365} & \scalebox{1.0}{0.326} & \scalebox{1.0}{0.361} \\
    \midrule
    \scalebox{0.9}{Weather} & \scalebox{1.0}{\textbf{0.230}} & \scalebox{1.0}{0.271} & \scalebox{1.0}{0.239} & \scalebox{1.0}{0.275} & \scalebox{1.0}{0.234} & \scalebox{1.0}{\textbf{0.265}} & \scalebox{1.0}{0.239} & \scalebox{1.0}{0.276} & \scalebox{1.0}{0.232} & \scalebox{1.0}{0.261} & \scalebox{1.0}{0.286} & \scalebox{1.0}{0.349} & \scalebox{1.0}{0.324} & \scalebox{1.0}{0.329} & \scalebox{1.0}{0.233} & \scalebox{1.0}{0.267} \\
    \midrule
    \scalebox{0.9}{Electricity} & \scalebox{1.0}{\textbf{0.162}} & \scalebox{1.0}{\textbf{0.256}} & \scalebox{1.0}{0.212} & \scalebox{1.0}{0.300} & \scalebox{1.0}{0.205} & \scalebox{1.0}{0.296} & \scalebox{1.0}{0.209} & \scalebox{1.0}{0.289} & \scalebox{1.0}{0.186} & \scalebox{1.0}{0.274} & \scalebox{1.0}{0.363} & \scalebox{1.0}{0.432} & \scalebox{1.0}{0.215} & \scalebox{1.0}{0.295} & \scalebox{1.0}{0.213} & \scalebox{1.0}{0.293} \\
    \midrule
    \scalebox{0.9}{Traffic} & \scalebox{1.0}{\textbf{0.392}} & \scalebox{1.0}{\textbf{0.264}} & \scalebox{1.0}{0.490} & \scalebox{1.0}{0.316} & \scalebox{1.0}{0.475} & \scalebox{1.0}{0.310} & \scalebox{1.0}{0.586} & \scalebox{1.0}{0.362} & \scalebox{1.0}{0.713} & \scalebox{1.0}{0.397} & \scalebox{1.0}{0.717} & \scalebox{1.0}{0.456} & \scalebox{1.0}{0.435} & \scalebox{1.0}{0.362} & \scalebox{1.0}{0.470} & \scalebox{1.0}{0.350} \\ 
    \midrule
    \scalebox{0.9}{Avg} & \scalebox{1.0}{\textbf{0.308}} & \scalebox{1.0}{\textbf{0.331}} & \scalebox{1.0}{0.343} & \scalebox{1.0}{0.356} & \scalebox{1.0}{0.336} & \scalebox{1.0}{0.346} & \scalebox{1.0}{0.368} & \scalebox{1.0}{0.364} & \scalebox{1.0}{0.388} & \scalebox{1.0}{0.368} & \scalebox{1.0}{0.507} & \scalebox{1.0}{0.501} & \scalebox{1.0}{0.361} & \scalebox{1.0}{0.376} & \scalebox{1.0}{0.401} & \scalebox{1.0}{0.393} \\ 
    \bottomrule
  \end{tabular}
    \end{small}
  \end{threeparttable}
```
```{=latex}
\vskip 0.1in
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{1.5pt}
  \renewcommand\arraystretch{1.3}
  \vspace{-15pt}
  \begin{tabular}{c|cccccccccccccc}
    \toprule
    \scalebox{1.0}{Models} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{\textbf{SimMTM}}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Ti-MAE}} \cite{li2023ti-TiMAE}} &
    \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TST}} \cite{zerveas2021transformer-TST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{LaST}} \cite{wang2022learning-LaST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TF-C}} \cite{Zhang2022-TF-C}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{CoST}} \cite{woo2022cost-CoST}} &  \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TS2Vec}} \cite{Yue2022-TS2Vec}} \\
    \cmidrule(lr){2-15}
    \scalebox{1.0}{Metric} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} \\
    \midrule
    \scalebox{0.9}{ETTh2 $\to$ ETTh1} & \scalebox{1.0}{\textbf{0.415}} & \scalebox{1.0}{\textbf{0.430}} & \scalebox{1.0}{0.466} & \scalebox{1.0}{0.456} & \scalebox{1.0}{0.469} & \scalebox{1.0}{0.459} & \scalebox{1.0}{0.443} & \scalebox{1.0}{0.471} & \scalebox{1.0}{0.635} & \scalebox{1.0}{0.634} & \scalebox{1.0}{0.428} & \scalebox{1.0}{0.433} & \scalebox{1.0}{0.517} & \scalebox{1.0}{0.486} \\
    \midrule
    \scalebox{0.9}{ETTm1 $\to$ ETTh1} & \scalebox{1.0}{\textbf{0.422}} & \scalebox{1.0}{\textbf{0.430}} & \scalebox{1.0}{0.495} & \scalebox{1.0}{0.469} & \scalebox{1.0}{0.475} & \scalebox{1.0}{0.463} & \scalebox{1.0}{0.426} & \scalebox{1.0}{0.441} & \scalebox{1.0}{0.700} & \scalebox{1.0}{0.702} & \scalebox{1.0}{0.620} & \scalebox{1.0}{0.541} & \scalebox{1.0}{0.484} & \scalebox{1.0}{0.482} \\
    \midrule
    \scalebox{0.9}{ETTm2 $\to$ ETTh1} & \scalebox{1.0}{\textbf{0.428}} & \scalebox{1.0}{\textbf{0.441}} & \scalebox{1.0}{0.464} & \scalebox{1.0}{0.456} & \scalebox{1.0}{0.453} & \scalebox{1.0}{0.450} & \scalebox{1.0}{0.503} & \scalebox{1.0}{0.507} & \scalebox{1.0}{1.091} & \scalebox{1.0}{0.814} & \scalebox{1.0}{0.598} & \scalebox{1.0}{0.548} & \scalebox{1.0}{0.616} & \scalebox{1.0}{0.550} \\
    \midrule
    \scalebox{0.9}{Weather $\to$ ETTh1} & \scalebox{1.0}{\textbf{0.456}} & \scalebox{1.0}{0.467} & \scalebox{1.0}{0.462} & \scalebox{1.0}{0.464} & \scalebox{1.0}{0.465} & \scalebox{1.0}{\textbf{0.456}} & - & - & - & - & \scalebox{1.0}{0.518} & \scalebox{1.0}{0.487} & \scalebox{1.0}{0.463} & \scalebox{1.0}{0.460} \\
    \midrule
    \scalebox{0.9}{ETTh1 $\to$ ETTm1} & \scalebox{1.0}{\textbf{0.346}} & \scalebox{1.0}{\textbf{0.384}} & \scalebox{1.0}{0.360} & \scalebox{1.0}{0.390} & \scalebox{1.0}{0.373} & \scalebox{1.0}{0.393} & \scalebox{1.0}{0.353} & \scalebox{1.0}{0.390} & \scalebox{1.0}{0.746} & \scalebox{1.0}{0.652} & \scalebox{1.0}{0.370} & \scalebox{1.0}{0.393} & \scalebox{1.0}{0.699} & \scalebox{1.0}{0.557} \\
    \midrule
    \scalebox{0.9}{ETTh2 $\to$ ETTm1} & \scalebox{1.0}{0.365} & \scalebox{1.0}{\textbf{0.384}} & \scalebox{1.0}{0.383} & \scalebox{1.0}{0.402} & \scalebox{1.0}{0.391} & \scalebox{1.0}{0.409} & \scalebox{1.0}{0.475} & \scalebox{1.0}{0.489} & \scalebox{1.0}{0.750} & \scalebox{1.0}{0.654} & \scalebox{1.0}{\textbf{0.363}} & \scalebox{1.0}{0.387} & \scalebox{1.0}{0.694} & \scalebox{1.0}{0.557} \\
    \midrule
    \scalebox{0.9}{ETTm2 $\to$ ETTm1} & \scalebox{1.0}{\textbf{0.351}} & \scalebox{1.0}{\textbf{0.383}} & \scalebox{1.0}{0.390} & \scalebox{1.0}{0.410} & \scalebox{1.0}{0.382} & \scalebox{1.0}{0.402} & \scalebox{1.0}{0.414} & \scalebox{1.0}{0.464} & \scalebox{1.0}{0.758} & \scalebox{1.0}{0.699} & \scalebox{1.0}{0.385} & \scalebox{1.0}{0.412} & \scalebox{1.0}{0.423} & \scalebox{1.0}{0.420} \\
    \midrule
   \scalebox{0.9}{ Weather $\to$ ETTm1} & \scalebox{1.0}{\textbf{0.350}} & \scalebox{1.0}{\textbf{0.383}} & \scalebox{1.0}{0.411} & \scalebox{1.0}{0.423} & \scalebox{1.0}{0.368} & \scalebox{1.0}{0.392} & - & - & - & - & \scalebox{1.0}{0.382} & \scalebox{1.0}{0.403} & \scalebox{1.0}{0.382} & \scalebox{1.0}{0.395} \\ 
    \midrule
   \scalebox{0.9}{Avg} & \scalebox{1.0}{\textbf{0.392}} & \scalebox{1.0}{\textbf{0.413}} & \scalebox{1.0}{0.429} & \scalebox{1.0}{0.434} & \scalebox{1.0}{0.422} & \scalebox{1.0}{0.428} & \scalebox{1.0}{0.436} & \scalebox{1.0}{0.460} & \scalebox{1.0}{0.780} & \scalebox{1.0}{0.693} & \scalebox{1.0}{0.458} & \scalebox{1.0}{0.451} & \scalebox{1.0}{0.535} & \scalebox{1.0}{0.488} \\ 
    \bottomrule
  \end{tabular}
    \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-10pt}
```
#### Cross-domain.

As demonstrated in Table `\ref{tab:forecasting_crodomain_short}`{=latex}, we evaluate multiple scenarios to verify the effectiveness under the cross-domain setting, where SimMTM consistently outperforms the other baselines. Note that the channel-independent encoder enables the compared baselines to transfer pre-trained models between datasets with different numbers of variates: Weather $\to$ {ETTh1, ETTm1}. However, LaST and TF-C rely on model-specific designs and cannot be applied to these scenarios. Although negative transfer is observed in some cross-domain scenarios, such as Weather $\to$ ETTh1 and ETTh2 $\to$ ETTm1, SimMTM remains significantly superior to the other baselines overall.
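The channel-independent strategy mentioned above can be sketched as follows: each variate is folded into the batch dimension and encoded as an independent univariate series, so the same encoder applies to datasets with different numbers of variates. This is a minimal illustration with assumed tensor shapes, not the official SimMTM code.

```python
import torch

def channel_independent(x: torch.Tensor) -> torch.Tensor:
    """Fold the variate dimension into the batch dimension.

    x: [batch, length, channels] -> [batch * channels, length, 1]
    Each variate is then encoded as an independent univariate series,
    so one encoder transfers across datasets with different variate counts.
    """
    b, l, c = x.shape
    return x.permute(0, 2, 1).reshape(b * c, l, 1)

weather = torch.randn(8, 336, 21)  # e.g. Weather: 21 variates
etth1 = torch.randn(8, 336, 7)     # e.g. ETTh1: 7 variates
assert channel_independent(weather).shape == (8 * 21, 336, 1)
assert channel_independent(etth1).shape == (8 * 7, 336, 1)
```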

Classification
--------------

#### In-domain.

We investigate the in-domain pre-training effect on the classification tasks in Table `\ref{tab:classification_incrodomain_short}`{=latex}. Note that, unlike forecasting, the classification task requires the model to learn high-level time series representations. From Table `\ref{tab:classification_incrodomain_short}`{=latex}, we find that the contrastive pre-training baselines achieve competitive performance. In contrast, the masking-based models Ti-MAE [@li2023ti-TiMAE] and TST [@zerveas2021transformer-TST] perform poorly, and TST even exhibits negative transfer compared to random initialization, indicating that contrastive learning is generally more suitable for classification tasks. Surprisingly, although SimMTM follows the masked modeling paradigm, with our specially-designed reconstruction task it still achieves the best performance on the classification task. This benefit comes from the neighborhood aggregation over *multiple* masked series, which enables the model to exploit the local structure of the time series manifold.

```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{1.5pt}
  \renewcommand\arraystretch{1.2}
  \begin{tabular}{c|cccccccc}
    \toprule
    \scalebox{1.0}{Models} & \rotatebox{0}{\scalebox{0.9}{\textbf{SimMTM}}} & \rotatebox{0}{\scalebox{0.9}{Random init.}} & \rotatebox{0}{\scalebox{0.9}{Ti-MAE \cite{li2023ti-TiMAE}}} &
    \rotatebox{0}{\scalebox{0.9}{TST \cite{zerveas2021transformer-TST}}}  & \rotatebox{0}{\scalebox{0.9}{LaST \cite{wang2022learning-LaST}}} & \rotatebox{0}{\scalebox{0.9}{TF-C \cite{Zhang2022-TF-C}}}  & \rotatebox{0}{\scalebox{0.9}{CoST \cite{woo2022cost-CoST}}}  &  \rotatebox{0}{\scalebox{0.9}{TS2Vec \cite{Yue2022-TS2Vec}}}  \\
    \midrule
   \scalebox{1.0}{Epilepsy $\to$ Epilepsy} & \scalebox{1.0}{\textbf{94.75}} & \scalebox{1.0}{89.83} & \scalebox{1.0}{80.34} & \scalebox{1.0}{80.89} & \scalebox{1.0}{92.11} & \scalebox{1.0}{93.96} & \scalebox{1.0}{92.35} & \scalebox{1.0}{92.33} \\ 
    \midrule
   \scalebox{1.0}{SleepEEG $\to$ Epilepsy} & \scalebox{1.0}{\textbf{95.49}} & \scalebox{1.0}{89.83} & \scalebox{1.0}{73.45} & \scalebox{1.0}{82.89} & \scalebox{1.0}{86.46} & \scalebox{1.0}{94.95} & \scalebox{1.0}{93.66} & \scalebox{1.0}{94.46} \\ 
    \midrule
   \scalebox{1.0}{SleepEEG $\to$ FD-B} & \scalebox{1.0}{\textbf{69.40}} & \scalebox{1.0}{47.36} & \scalebox{1.0}{67.98} & \scalebox{1.0}{65.57} & \scalebox{1.0}{46.67} & \scalebox{1.0}{69.38} & \scalebox{1.0}{54.82} & \scalebox{1.0}{60.74} \\ 
    \midrule
   \scalebox{1.0}{SleepEEG $\to$ Gesture} & \scalebox{1.0}{\textbf{80.00}} & \scalebox{1.0}{42.19} & \scalebox{1.0}{75.54} & \scalebox{1.0}{75.12} & \scalebox{1.0}{64.17} & \scalebox{1.0}{76.42} & \scalebox{1.0}{73.33} & \scalebox{1.0}{73.33} \\ 
    \midrule
   \scalebox{1.0}{SleepEEG $\to$ EMG} & \scalebox{1.0}{\textbf{97.56}} & \scalebox{1.0}{77.80} & \scalebox{1.0}{63.52} & \scalebox{1.0}{75.89} & \scalebox{1.0}{66.34} & \scalebox{1.0}{81.74} & \scalebox{1.0}{73.17} & \scalebox{1.0}{80.92} \\ 
    \midrule
   \scalebox{1.0}{Avg} & \scalebox{1.0}{\textbf{87.44}} & \scalebox{1.0}{69.40} & \scalebox{1.0}{72.17} & \scalebox{1.0}{76.07} & \scalebox{1.0}{71.15} & \scalebox{1.0}{83.29} & \scalebox{1.0}{77.47} & \scalebox{1.0}{80.36} \\ 
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-5pt}
```
#### Cross-domain.

We experiment with four cross-domain transfer scenarios in Table `\ref{tab:classification_incrodomain_short}`{=latex}: SleepEEG $\to$ {Epilepsy, FD-B, Gesture, EMG}, where the target datasets are distinct from the pre-training dataset. Due to the large gap between the pre-training and fine-tuning datasets, the baselines perform poorly in most cases, while SimMTM still surpasses both the other baselines and random initialization significantly. Especially for SleepEEG $\to$ EMG, SimMTM remarkably surpasses the previous state-of-the-art TF-C (*Accuracy*: 97.56% vs. 81.74%). These results demonstrate that SimMTM can precisely capture valuable knowledge from pre-training datasets and consistently benefit a wide range of downstream datasets.

```{=latex}
\begin{figure*}[t]\begin{center}
    \center{\includegraphics[width=1.02\textwidth]{fig/ablition.pdf}}
    \vspace{-10pt}
    \caption{Ablations of SimMTM on the reconstruction loss (${\cal L}_{\text{rec.}}$) and constraint loss (${\cal L}_{\text{con.}}$) in time series forecasting (left part) and classification (right part) tasks under both in-domain and cross-domain settings. More ablations are included in \update{\underline{Appendix \ref{app:fullresults}}}.}
    \label{fig:ablit}
\end{center}
\vspace{-10pt}
\end{figure*}
```
Model Analysis
--------------

#### Ablations.

As shown in Figure `\ref{fig:ablit}`{=latex}, we provide ablations on the two parts of the training loss in SimMTM. We observe that both ${\cal L}_{\text{reconstruction}}$ and ${\cal L}_{\text{constraint}}$ are essential to the final performance. In particular, for the SleepEEG $\to$ EMG experiment, SimMTM surpasses random initialization remarkably, where the reconstruction and constraint losses provide 9.7% and 16.0% absolute improvements respectively. Besides, we also find that, in comparison to ${\cal L}_{\text{reconstruction}}$, ${\cal L}_{\text{constraint}}$ contributes more to the final results. This stems from our design: the constraint loss uncovers a proper time series manifold that is helpful for reconstruction from multiple masked series, without which the neighborhood aggregation degenerates to a trivial average.
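As a rough sketch of how the two ablated terms combine, the snippet below adds a point-wise reconstruction MSE and a contrastive-style constraint over series-wise similarities. The weighting `lam`, the tensor shapes, and the exact contrastive form are assumptions for illustration, not the paper's precise formulation.

```python
import torch
import torch.nn.functional as F

def pretrain_loss(recon, target, sim, pos_mask, lam=1.0):
    """Total pre-training loss = reconstruction + lam * constraint
    (a sketch; `lam` and the contrastive form are assumptions).

    recon, target: [batch, length, channels] reconstructed/original series
    sim:           [n, n] series-wise similarity logits
    pos_mask:      [n, n] bool, True where series j is a masked neighbor of i
    """
    l_rec = F.mse_loss(recon, target)      # point-wise reconstruction
    log_prob = F.log_softmax(sim, dim=-1)  # normalize similarities per series
    l_con = -log_prob[pos_mask].mean()     # pull masked neighbors together
    return l_rec + lam * l_con
```

Without the constraint term, the similarities that weight the neighborhood aggregation are unregularized, which is why its removal hurts most in the ablation.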

#### Representation analysis.

To illustrate the advantages of SimMTM intuitively, we provide a representation analysis in Table `\ref{tab:cka_analysis}`{=latex}, from which we make the following observations. Firstly, the CKA value of SimMTM on the classification task is clearly smaller than on the forecasting task, where the former is a high-level task and the latter requires low-level representations. These results demonstrate that SimMTM can learn adaptive representations for different tasks, which benefits from our design of the pre-training loss. Concretely, the temporal variations of the classification pre-training datasets are much more diverse than those of the forecasting datasets. Thus, ${\cal L}_{\text{constraint}}$ is easier to optimize in classification, deriving a smaller CKA value. Secondly, from $|\Delta_{\text{CKA}}|$, we observe that the models pre-trained with SimMTM present a smaller gap with respect to the final fine-tuned models in their representation-learning properties, which explains why SimMTM can consistently improve downstream tasks.
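For reference, the linear variant of the CKA similarity used in this analysis can be computed as follows; this is the standard formulation following Kornblith et al., with illustrative shapes rather than the actual layer activations.

```python
import numpy as np

def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two representation matrices of shape
    [samples, features]; the two feature dimensions may differ."""
    x = x - x.mean(axis=0)  # center each feature
    y = y - y.mean(axis=0)
    hsic = np.linalg.norm(y.T @ x, "fro") ** 2
    return float(hsic / (np.linalg.norm(x.T @ x, "fro")
                         * np.linalg.norm(y.T @ y, "fro")))

rng = np.random.default_rng(0)
first_layer = rng.normal(size=(128, 64))                       # bottom-layer features
last_layer = np.tanh(first_layer @ rng.normal(size=(64, 32)))  # top-layer features
similarity = linear_cka(first_layer, last_layer)               # value in [0, 1]
assert abs(linear_cka(first_layer, first_layer) - 1.0) < 1e-6  # identical reps -> 1
```

A value near 1 means the two layers encode essentially the same information; a small value indicates the top layer has moved to more abstract representations.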

```{=latex}
\begin{table*}[t]\caption{Representation analysis for different methods in classification and forecasting tasks. For each model, we calculate the Centered Kernel Alignment (CKA) similarity (\%) \cite{Kornblith2019SimilarityON} between representations from the first and the last encoder layers \update{to measure the representation-learning property of deep models}. \update{Since the bottom layer representations usually contain low-level or detailed information, a smaller CKA similarity means the top layer includes different information from the bottom layer and indicates the model tends to learn high-level representations or more abstract information}. For comparison, we also calculate the $|\Delta_{\text{CKA}}|$ between pre-trained and fine-tuned models, where a smaller value indicates a smaller representation gap between pre-training and fine-tuning, and the representations have stronger universality and portability.}
  \label{tab:cka_analysis}
  \vspace{5pt}
  \centering
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{2.8pt}
  \renewcommand\arraystretch{1.2}
  \begin{tabular}{lc|cccccc|c}
    \toprule
    \multicolumn{2}{c}{\scalebox{1.0}{Tasks}} & \scalebox{0.9}{Ti-MAE} \cite{li2023ti-TiMAE} & \scalebox{0.9}{TST} \cite{zerveas2021transformer-TST} & \scalebox{0.9}{LaST} \cite{wang2022learning-LaST} & \scalebox{0.9}{TF-C} \cite{Zhang2022-TF-C} & \scalebox{0.9}{CoST} \cite{woo2022cost-CoST} & \scalebox{0.9}{TS2vec} \cite{Yue2022-TS2Vec} & \scalebox{0.9}{\textbf{SimMTM}} \\ 
    \toprule
    \scalebox{1.0}{\multirow{3}{*}{Classification}} &
    \scalebox{0.9}{Pre-training} & \scalebox{1.0}{84.12} & \scalebox{1.0}{54.98} & \scalebox{1.0}{82.01} & \scalebox{1.0}{85.78} & \scalebox{1.0}{60.74} & \scalebox{1.0}{70.01} & \scalebox{1.0}{33.87} \\
    & \scalebox{0.9}{Fine-tuning} & \scalebox{1.0}{87.26} & \scalebox{1.0}{55.80} & \scalebox{1.0}{79.56} & \scalebox{1.0}{86.30} & \scalebox{1.0}{62.24} & \scalebox{1.0}{69.79} & \scalebox{1.0}{32.84} \\
    \cmidrule(lr){2-9}
    &$|\mathbf{\Delta_{\text{CKA}}}|$ & \scalebox{1.0}{3.14} & \scalebox{1.0}{0.82} & \scalebox{1.0}{2.45} & \scalebox{1.0}{1.53} & \scalebox{1.0}{1.50} & \scalebox{1.0}{\textbf{0.22}} & \scalebox{1.0}{1.04} \\
    \cmidrule(lr){1-9}
    \scalebox{1.0}{\multirow{3}{*}{Forecasting}} &
    \scalebox{0.9}{Pre-training}  & \scalebox{1.0}{83.60} & \scalebox{1.0}{99.76} & \scalebox{1.0}{75.20} & \scalebox{1.0}{59.35} & \scalebox{1.0}{87.09} & \scalebox{1.0}{70.20} & \scalebox{1.0}{97.79} \\
    & \scalebox{0.9}{Fine-tuning} & \scalebox{1.0}{91.24} & \scalebox{1.0}{94.92} & \scalebox{1.0}{79.25} & \scalebox{1.0}{60.60} & \scalebox{1.0}{77.38} & \scalebox{1.0}{83.73} & \scalebox{1.0}{97.89} \\
    \cmidrule(lr){2-9}
    &$|\mathbf{\Delta_{\text{CKA}}}|$  & \scalebox{1.0}{7.64} & \scalebox{1.0}{4.84} & \scalebox{1.0}{4.05} & \scalebox{1.0}{1.25} & \scalebox{1.0}{9.70} & \scalebox{1.0}{13.53} & \scalebox{1.0}{\textbf{0.11}} \\
    \cmidrule(lr){1-9}
    \multicolumn{2}{c}{\scalebox{1.0}{$\text{Sum} |\mathbf{\Delta_{\text{CKA}}}|$}}  & \scalebox{1.0}{10.78} & \scalebox{1.0}{5.66} & \scalebox{1.0}{6.50} & \scalebox{1.0}{2.77} & \scalebox{1.0}{11.20} & \scalebox{1.0}{13.75} & \scalebox{1.0}{\textbf{1.15}} \\
    \bottomrule
  \end{tabular}
  \end{small}
    \vspace{-5pt}
\end{table*}
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{7.5pt}
  \renewcommand\arraystretch{1.2}
  \begin{tabular}{c|ll|ll|ll|ll}
    \toprule
    \scalebox{1.0}{Dataset} & \multicolumn{2}{c}{\scalebox{1.0}{ETTh1}} & \multicolumn{2}{c}{\scalebox{1.0}{ETTh2}} & \multicolumn{2}{c}{\scalebox{1.0}{ETTm1}}  & \multicolumn{2}{c}{\scalebox{1.0}{ETTm2}} \\
    \cmidrule(lr){2-3} \cmidrule(lr){4-5}\cmidrule(lr){6-7} \cmidrule(lr){8-9}
    \scalebox{1.0}{Model} & \scalebox{1.0}{MSE} & \scalebox{1.0}{MAE} & \scalebox{1.0}{MSE} & \scalebox{1.0}{MAE} & \scalebox{1.0}{MSE} & \scalebox{1.0}{MAE} & \scalebox{1.0}{MSE} & \scalebox{1.0}{MAE}  \\
    \toprule
    \scalebox{1.0}{Transformer} \cite{vaswani2017attention-Transformer} & 1.088 & 0.836 & 4.103 & 1.612 & 0.901 & 0.704 & 1.624 & 0.901 \\
    \scalebox{1.0}{\textbf{ + SimMTM }} & \scalebox{1.0}{\textbf{0.927}} & \scalebox{1.0}{\textbf{0.761}} & \scalebox{1.0}{\textbf{3.498}} &
    \scalebox{1.0}{\textbf{1.487}} & \scalebox{1.0}{\textbf{0.809}} & \scalebox{1.0}{\textbf{0.663}} & \scalebox{1.0}{\textbf{1.322}} & \scalebox{1.0}{\textbf{0.808}} \\
    \midrule
    \scalebox{1.0}{Autoformer} \cite{wu2021-autoformer} & \scalebox{1.0}{0.573} & \scalebox{1.0}{0.573} & \scalebox{1.0}{0.550} & \scalebox{1.0}{0.559} & \scalebox{1.0}{0.615} & \scalebox{1.0}{0.528} & \scalebox{1.0}{0.324} & \scalebox{1.0}{0.368} \\
    \scalebox{1.0}{\textbf{ + SimMTM }} & \scalebox{1.0}{\textbf{0.561}} & \scalebox{1.0}{\textbf{0.568}} & \scalebox{1.0}{\textbf{0.543}} & \scalebox{1.0}{\textbf{0.555}} & \scalebox{1.0}{\textbf{0.553}} & \scalebox{1.0}{\textbf{0.505}} & \scalebox{1.0}{\textbf{0.315}} & \scalebox{1.0}{\textbf{0.360}} \\
    \midrule
    \scalebox{1.0}{NS Transformer} \cite{liunon-NS} & \scalebox{1.0}{0.570} & \scalebox{1.0}{0.537} & \scalebox{1.0}{0.526} & \scalebox{1.0}{0.516} & \scalebox{1.0}{0.481} & \scalebox{1.0}{0.456} & \scalebox{1.0}{0.306} & \scalebox{1.0}{0.347} \\
    \scalebox{1.0}{\textbf{ + SimMTM }} & \scalebox{1.0}{\textbf{0.543}} & \scalebox{1.0}{\textbf{0.527}} & \scalebox{1.0}{\textbf{0.493}} & \scalebox{1.0}{\textbf{0.514}} & \scalebox{1.0}{\textbf{0.431}} & \scalebox{1.0}{\textbf{0.455}} & \scalebox{1.0}{\textbf{0.301}} & \scalebox{1.0}{\textbf{0.345}} \\
    \midrule
    \scalebox{1.0}{PatchTST} \cite{nie2022time-patch} & \scalebox{1.0}{0.417} & \scalebox{1.0}{0.431} & \scalebox{1.0}{0.331} & \scalebox{1.0}{0.379} & \scalebox{1.0}{0.352} & \scalebox{1.0}{0.382} & \scalebox{1.0}{0.258} & \scalebox{1.0}{0.317} \\
     \scalebox{1.0}{+ Sub-series Masking}  & \scalebox{1.0}{\textcolor{gray}{0.430$\downarrow$}} & \scalebox{1.0}{\textcolor{gray}{0.445$\downarrow$}} &  \scalebox{1.0}{\textcolor{gray}{0.355$\downarrow$}} & \scalebox{1.0}{\textcolor{gray}{0.394$\downarrow$}} & \scalebox{1.0}{\textbf{0.341}} & \scalebox{1.0}{0.379} & \scalebox{1.0}{0.258} & \scalebox{1.0}{\textcolor{gray}{0.318$\downarrow$}} \\
    \scalebox{1.0}{\textbf{ + SimMTM }} & \scalebox{1.0}{\textbf{0.409}} & \scalebox{1.0}{\textbf{0.428}} &  \scalebox{1.0}{\textbf{0.329}} & \scalebox{1.0}{0.379} & \scalebox{1.0}{0.348} & \scalebox{1.0}{\textbf{0.378}} & \scalebox{1.0}{\textbf{0.254}} & \scalebox{1.0}{\textbf{0.313}} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-10pt}
```
```{=latex}
\begin{figure*}[t]\vspace{25pt}
    \setlength{\abovecaptionskip}{0.cm}
    \setlength{\belowcaptionskip}{-0.cm}
\begin{center}
    \center{\includegraphics[width=\textwidth]{fig/limitdata_masking.pdf}}
    \vspace{-10pt}
\caption{Model analysis. Left part is for fine-tuning ETTh2 pre-trained model to ETTh1 with limited data, where a smaller MSE indicates better performance. Right part presents the MSE performance of SimMTM in the ETTh1 ``input-336-predict-96'' in-domain setting with different masked ratio $r$ and numbers of masked series $M$, where a darker red means better performance.}
    \label{fig:limitdata_masking}
\end{center}
\vspace{-15pt}
\end{figure*}
```
#### Model generality.

From Table `\ref{tab:Generality}`{=latex}, we find that, as a general time series pre-training framework, SimMTM can consistently improve the forecasting performance of diverse `\update{advanced}`{=latex} base models, even the state-of-the-art time series forecasting model PatchTST [@nie2022time-patch]. This generality also indicates that we can further improve performance by employing advanced base models as encoders. `\update{It is also notable that, in contrast to the negative transfer caused by the canonical sub-series masked modeling used in the PatchTST paper \cite{nie2022time-patch}, the consistent improvement brought by SimMTM further verifies the effectiveness of our design.}`{=latex}

```{=latex}
\vspace{-10pt}
```
#### Fine-tuning to limited data scenarios.

One essential application of pre-trained models is to provide prior knowledge for downstream tasks, especially in limited-data scenarios, which is critical to the fast adaptation of deep models. Thus, to verify the effectiveness of SimMTM and other pre-training methods in data-limited scenarios, we pre-train a model on ETTh2 and fine-tune it on ETTh1 with different proportions of the training data. All results are presented in Figure `\ref{fig:limitdata_masking}`{=latex}. We find that SimMTM achieves significant performance gains across all data proportions compared to other time series pre-training methods. `\update{Specifically, for the 10\% data fine-tuning setting, SimMTM significantly outperforms the advanced masking-based method Ti-MAE~\cite{li2023ti-TiMAE} (\emph{MSE}: 0.591 vs. 0.660). Compared with the contrastive-based method TF-C \cite{Zhang2022-TF-C}, SimMTM also achieves a 26.0\% MSE reduction. These results further verify that SimMTM can effectively capture valuable knowledge from datasets and boost the final performance, even in limited-data scenarios.}`{=latex}
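A minimal sketch of the limited-data protocol: keep only a fraction of the training windows before fine-tuning. The chronological truncation below is an assumption for illustration; the exact split used in the experiments may differ.

```python
import torch
from torch.utils.data import Subset, TensorDataset

def limited_train_set(dataset, proportion: float):
    """Keep the earliest `proportion` of training windows, preserving
    temporal order (a plausible protocol, not the paper's exact split)."""
    n_keep = max(1, int(len(dataset) * proportion))
    return Subset(dataset, range(n_keep))

# e.g. 1000 sliding windows of length 336 with 7 variates
full = TensorDataset(torch.randn(1000, 336, 7))
small = limited_train_set(full, 0.10)  # the 10% fine-tuning setting
assert len(small) == 100
```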

```{=latex}
\vspace{-10pt}
```
#### Masking strategy.

Note that the difficulty of reconstructing the original time series increases with the masked ratio but decreases as the number of neighboring masked series grows. We explore the relationship between the masked ratio and the number of masked series used for reconstruction, namely $r$ and $M$ in Eq. `\eqref{equ:aug_series}`{=latex} respectively. The experimental results in Figure `\ref{fig:limitdata_masking}`{=latex} show that we need to set $M\propto r$ to obtain better results. Experimentally, we choose a masking ratio of 50% and adopt three masked series for reconstruction throughout this paper. `\update{In addition, we find that using only one masked series ($M=1$) for pre-training generally performs worse than settings with larger $M$, where the latter enables the model to discover the relations between the input series and its neighbors. See Figure \ref{fig:intro} for an intuitive understanding. These results further highlight the advantage of our design in manifold learning.}`{=latex}
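The interplay between $r$ and $M$ can be illustrated with a simple point-wise masking sketch that produces $M$ masked neighbors per input series. The function name, shapes, and the exact masking pattern are illustrative assumptions, not the paper's masking operation.

```python
import torch

def masked_views(x: torch.Tensor, r: float = 0.5, m: int = 3):
    """Generate M masked copies of each series by zeroing a ratio r of
    time points, drawn independently per copy (point-wise random masking;
    a simplification of the actual masking strategy).

    x: [batch, length, channels] -> views: [batch * m, length, channels]
    """
    views = x.repeat_interleave(m, dim=0)                  # M neighbors per series
    mask = torch.rand(views.shape[0], views.shape[1], 1) < r
    mask = mask.expand_as(views)                           # share mask over channels
    return views.masked_fill(mask, 0.0), mask

x = torch.randn(4, 96, 7)
views, mask = masked_views(x, r=0.5, m=3)  # the default r=50%, M=3 setting
assert views.shape == (12, 96, 7)
assert torch.all(views[mask] == 0)
```

A larger $r$ ruins more of each copy, while a larger $M$ supplies more complementary ruined views, which matches the observed $M\propto r$ trade-off.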

```{=latex}
\vspace{-5pt}
```
Conclusion
==========

```{=latex}
\vspace{-5pt}
```
This paper presents SimMTM, a simple pre-training framework for masked time-series modeling. Going beyond the previous convention of reconstructing the original time series from unmasked time points, SimMTM proposes a new masked modeling task: reconstructing the original series from its multiple neighboring masked series. Concretely, SimMTM aggregates point-wise representations based on series-wise similarities, which are carefully constrained by the neighborhood assumption on the time series manifold. Experimentally, SimMTM best bridges the gap between pre-trained and fine-tuned models, thereby achieving consistent state-of-the-art performance in distinct forecasting and classification tasks compared to the most advanced time series pre-training methods, covering both in-domain and cross-domain settings. In the future, we will further extend SimMTM to large-scale and diverse pre-training datasets in pursuit of a foundation model for time series analysis.

Acknowledgments {#acknowledgments .unnumbered}
===============

This work was supported by the National Key Research and Development Plan (2021YFB1715200), National Natural Science Foundation of China (62022050 and 62021002), and Beijing Nova Program (Z201100006820041).

```{=latex}
\small
```
```{=latex}
\bibliographystyle{plain}
```
```{=latex}
\newpage
```
```{=latex}
\appendix
```
Implementation Details {#app:implementation}
======================

All the experiments are repeated five times, implemented in PyTorch [@Paszke2019PyTorchAI], and conducted on an NVIDIA A100 SXM4 40GB GPU. We implement the baselines based on their official implementations and follow the configurations from their original papers. For the metrics, we adopt the mean squared error (MSE) and mean absolute error (MAE) for time series forecasting. For classification, accuracy, precision, recall, F1 score, and their average value are recorded.
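For clarity, the two forecasting metrics reduce to the following standard formulas, shown here with a small worked example:

```python
import numpy as np

def mse(pred: np.ndarray, true: np.ndarray) -> float:
    """Mean squared error over all forecast points."""
    return float(np.mean((pred - true) ** 2))

def mae(pred: np.ndarray, true: np.ndarray) -> float:
    """Mean absolute error over all forecast points."""
    return float(np.mean(np.abs(pred - true)))

pred = np.array([0.1, 0.4, 0.7])
true = np.array([0.0, 0.5, 0.5])
print(round(mse(pred, true), 4))  # -> 0.02
print(round(mae(pred, true), 4))  # -> 0.1333
```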

Dataset Description {#app:dataset}
-------------------

We conduct experiments to evaluate the effect of our method under in-domain and cross-domain settings on twelve real-world datasets for two typical time series analysis tasks, forecasting and classification, covering diverse application scenarios (electricity systems, neurological healthcare, human activity recognition, mechanical fault detection, and physical status monitoring), different types of signals (ECG, EMG, acceleration, vibration, power load, weather, and transportation), multivariate channel dimensions (from 1 to 862), varying time series lengths (from 96 to 5120), and a wide range of sampling rates (from 100 Hz to 4000 Hz). The detailed descriptions of these datasets are summarized in Table `\ref{tab:dataset detail}`{=latex}.

```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{3.5pt}
  \renewcommand\arraystretch{2.0}
  \begin{tabular}{c|c|c|c|c|c|c|c}
    \toprule
    \scalebox{0.9}{Tasks} & \scalebox{0.9}{Datasets} & \scalebox{0.9}{Channels} & \scalebox{0.9}{Length} & \scalebox{0.9}{Samples} & \scalebox{0.9}{Classes} & \scalebox{0.9}{Information} & \scalebox{0.9}{Frequency} \\
    \toprule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{Forecasting}}} & \scalebox{0.9}{ETTh1,ETTh2} & 7 & \scalebox{0.9}{\{96,192,336,720\}} & \scalebox{0.9}{8545/2881/2881} & - & \scalebox{0.9}{Electricity} & \scalebox{0.9}{1 Hour} \\
    & \scalebox{0.9}{ETTm1,ETTm2} & \scalebox{0.9}{7} & \scalebox{0.9}{\{96,192,336,720\}} & \scalebox{0.9}{34465/11521/11521} & - & \scalebox{0.9}{Electricity} & \scalebox{0.9}{15 Mins} \\
    & \scalebox{0.9}{Weather} & \scalebox{0.9}{21} & \scalebox{0.9}{\{96,192,336,720\}} & \scalebox{0.9}{36792/5271/10540} & - & \scalebox{0.9}{Weather} & \scalebox{0.9}{10 Mins} \\
    & \scalebox{0.9}{Electricity} & \scalebox{0.9}{321} & \scalebox{0.9}{\{96,192,336,720\}} & \scalebox{0.9}{18317/2633/5261} & - & \scalebox{0.9}{Electricity} & \scalebox{0.9}{1 Hour} \\
    & \scalebox{0.9}{Traffic} & \scalebox{0.9}{862} & \scalebox{0.9}{\{96,192,336,720\}} & \scalebox{0.9}{12185/1757/3509} & - & \scalebox{0.9}{Transportation} & \scalebox{0.9}{1 Hour} \\
    \midrule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{Classification}}} & \scalebox{0.9}{SleepEEG} & \scalebox{0.9}{1} & \scalebox{0.9}{200} & \scalebox{0.9}{371005/-/-} & \scalebox{0.9}{5} & \scalebox{0.9}{EEG} & \scalebox{0.9}{100 Hz} \\
    & \scalebox{0.9}{Epilepsy} & \scalebox{0.9}{1} & \scalebox{0.9}{178} & \scalebox{0.9}{60/20/11420} & \scalebox{0.9}{2} & \scalebox{0.9}{EEG} & \scalebox{0.9}{174 Hz} \\
    & \scalebox{0.9}{FD-B} & \scalebox{0.9}{1} & \scalebox{0.9}{5120} & \scalebox{0.9}{60/21/135599} & \scalebox{0.9}{3} & \scalebox{0.9}{Fault Detection} & \scalebox{0.9}{64 kHz} \\
    & \scalebox{0.9}{Gesture} & \scalebox{0.9}{3} & \scalebox{0.9}{315} & \scalebox{0.9}{320/120/120} & \scalebox{0.9}{8} & \scalebox{0.9}{Hand Movement} & \scalebox{0.9}{100 Hz} \\
    & \scalebox{0.9}{EMG} & \scalebox{0.9}{1} & \scalebox{0.9}{1500} & \scalebox{0.9}{122/41/41} & \scalebox{0.9}{3} & \scalebox{0.9}{Muscle Responses} & \scalebox{0.9}{4 kHz} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
\(1\) **ETT (4 subsets)** [@zhou2021informer-ETT] contains the time series of oil temperature and power load collected by electricity transformers from July 2016 to July 2018. ETT is a group of four subsets with different recorded frequencies: ETTh1/ETTh2 are recorded every hour, and ETTm1/ETTm2 are recorded every 15 minutes.

\(2\) **WEATHER** [@weatherdata] includes meteorological time series with 21 weather indicators collected every 10 minutes from the Weather Station of the Max Planck Biogeochemistry Institute in 2020.

\(3\) **ELECTRICITY** [@ecldata] records the hourly electricity consumption of 321 clients from 2012 to 2014. The raw values are in kW at 15-minute intervals, and all time labels follow Portuguese time, so every day presents 96 measurements (24$\times$4). On the time-change day in March of each year (which has only 23 hours), the values between 1:00 am and 2:00 am are zero for all clients; on the time-change day in October (which has 25 hours), the values between 1:00 am and 2:00 am aggregate the consumption of two hours.

\(4\) **TRAFFIC** [@trafficdata] encompasses the hourly measures of road occupancy rates obtained from 862 sensors situated in the San Francisco Bay area freeways. These measurements were carried out between January 2015 and December 2016.

\(5\) **SLEEPEEG** [@kemp2000analysis-sleepeeg] contains 153 whole-night sleeping electroencephalography (EEG) recordings from 82 healthy subjects. We follow the same data preprocessing approach as [@Zhang2022-TF-C] and get 371,055 univariate brainwaves. Each brainwave is sampled at a frequency of 100 Hz and associated with one of five sleeping stages: Wake, Non-rapid eye movement (3 sub-states), and Rapid Eye Movement.

\(6\) **EPILEPSY** [@andrzejak2001indications-EPILEPSY] monitors the brain activity of 500 subjects with a single-channel EEG sensor. Each subject is recorded for 23.6 seconds. The dataset is sampled at 178 Hz and contains 11,500 samples in total. We follow the procedure described by [@Zhang2022-TF-C]: the first four classes of the original five categories (eyes open, eyes closed, EEG measured in the healthy brain region, and EEG measured in the tumor region) are merged into one class, and the remaining class constitutes the other, yielding a binary classification task.

\(7\) **FD-B** [@lessmeier2016condition-FD-B] is generated by electromechanical drive systems. It monitors the condition of rolling bearings and detects their failures based on monitored operating conditions, including speed, load torque, and radial force. Concretely, FD-B has 13,640 samples in total. Each recording is sampled at 64 kHz with 3-class labels: undamaged, inner damaged, and outer damaged.

\(8\) **GESTURE** [@liu2009uwave-Gesture] is collected from 8 hand gestures based on the hand-movement paths recorded by an accelerometer. The eight gestures are: swiping left, right, up, and down; waving in a counterclockwise circle, a clockwise circle, and a square; and waving a right arrow. The dataset contains 440 samples with balanced class labels, each belonging to one of the eight gesture categories.

\(9\) **EMG** [@physiobank2000physionet-EMG] is sampled at 4 kHz and consists of 163 single-channel EMG recordings from the tibialis anterior muscle of three volunteers: one healthy, one suffering from neuropathy, and one from myopathy. Each subject corresponds to one classification category, so each sample is associated with one of three classes.

Baselines Implementation  {#app:baselines}
------------------------

We compare SimMTM against six state-of-the-art baselines. To make the comparison fair and comprehensive, we try two implementation approaches for the forecasting and classification tasks: a unified encoder and the encoder from each method's official implementation. Notably, LaST [@wang2022learning-LaST] and TF-C [@Zhang2022-TF-C] are tightly coupled to their own model structures, so for these two we directly report the results from their papers or reproduce them with the official code.

```{=latex}
\setlength{\abovecaptionskip}{0.cm}
```
```{=latex}
\setlength{\belowcaptionskip}{-0.cm}
```
```{=latex}
\vspace{5pt}
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{5pt}
  \renewcommand\arraystretch{1.0}
  \begin{tabular}{c|c|c|c|c}
    \toprule
    Baselines & Task & Encoder & Performance Comparison & Report \\
    \midrule
    \multirow{5}{*}{Ti-MAE \cite{li2023ti-TiMAE}} & \multirow{2}{*}{Forecasting} &  Channel-independent Transformer & better & Main text \\
    \cmidrule(lr){3-5}
     & & Official implementation & & Section \ref{app:fullresults} \\
    \cmidrule(lr){2-5}
     & \multirow{2}{*}{Classification} & 1D-ResNet & better & Main text \\
     \cmidrule(lr){3-5}
      &  & Official implementation &  & Section \ref{app:fullresults} \\
    \midrule
    \multirow{5}{*}{TST \cite{zerveas2021transformer-TST}} & \multirow{2}{*}{Forecasting} & Channel-independent Transformer & better & Main text \\
    \cmidrule(lr){3-5}
     & & Official implementation & & Section \ref{app:fullresults} \\
    \cmidrule(lr){2-5}
     & \multirow{2}{*}{Classification} & 1D-ResNet & better & Main text \\
     \cmidrule(lr){3-5}
      &  & Official implementation &  & Section \ref{app:fullresults} \\
        \midrule
    \multirow{2}{*}{LaST \cite{wang2022learning-LaST}} & \multirow{1}{*}{Forecasting} & Official implementation & / & Main text \\
    \cmidrule(lr){2-5}
     & Classification & Official implementation & / & Main text \\
    \midrule
    \multirow{2}{*}{TF-C \cite{Zhang2022-TF-C}} & \multirow{1}{*}{Forecasting} & Official implementation & / & Main text \\
    \cmidrule(lr){2-5}
     & Classification & Official implementation & / & Main text \\
    \midrule
    \multirow{5}{*}{CoST \cite{woo2022cost-CoST}} & \multirow{2}{*}{Forecasting} & Channel-independent Transformer & better & Main text \\
    \cmidrule(lr){3-5}
     & & Official implementation & & Section \ref{app:fullresults} \\
    \cmidrule(lr){2-5}
     & \multirow{2}{*}{Classification} & 1D-ResNet & better & Main text \\
     \cmidrule(lr){3-5}
      &  & Official implementation &  & Section \ref{app:fullresults} \\
        \midrule
    \multirow{5}{*}{TS2Vec \cite{Yue2022-TS2Vec}} & \multirow{2}{*}{Forecasting} & Channel-independent Transformer & better & Main text \\
    \cmidrule(lr){3-5}
     & & Official implementation & & Section \ref{app:fullresults} \\
    \cmidrule(lr){2-5}
     & \multirow{2}{*}{Classification} & 1D-ResNet & better & Main text \\
     \cmidrule(lr){3-5}
      &  & Official implementation &  & Section \ref{app:fullresults} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
\(1\) **Unified encoder**. We attempt to unify the encoder for these pre-training methods. Specifically, we adopt the vanilla Transformer [@vaswani2017attention-Transformer] with channel independence [@nie2022time-patch] for forecasting to accomplish cross-domain transfer between datasets with different numbers of variates. For classification, we use 1D-ResNet [@He2016DeepRL] as the encoder following [@Zhang2022-TF-C]. Besides, we conduct a comprehensive hyperparameter search for all baselines. For the Transformer encoder, we vary the number of Transformer layers in $\{1, 2, 3, 4\}$, select the model dimension from $\{16, 32, 64, 128, 256\}$, and the number of attention heads from $\{4, 8, 16, 32\}$. For the 1D-ResNet, we search the number of 1D-ResNet layers from $\{1, 2, 3, 4\}$ and the kernel size from $\{3, 5, 8\}$. Additionally, for the masked modeling methods TST [@zerveas2021transformer-TST] and Ti-MAE [@li2023ti-TiMAE], we also search the masked ratio $r \in \{0.125, 0.25, 0.5, 0.75\}$ for better performance.
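Concretely, channel independence flattens each variate of a multivariate series into a separate univariate sample, which is what lets a single pre-trained encoder transfer across datasets with different variate numbers. A minimal sketch of this reshaping, assuming `(batch, length, n_vars)` tensors; the function names are ours, not from any released code:

```python
import numpy as np

def to_channel_independent(x: np.ndarray) -> np.ndarray:
    """(batch, length, n_vars) -> (batch * n_vars, length, 1)."""
    b, l, c = x.shape
    return x.transpose(0, 2, 1).reshape(b * c, l, 1)

def from_channel_independent(x: np.ndarray, n_vars: int) -> np.ndarray:
    """(batch * n_vars, length, 1) -> (batch, length, n_vars)."""
    bc, l, _ = x.shape
    return x.reshape(bc // n_vars, n_vars, l).transpose(0, 2, 1)
```

Each of the `batch * n_vars` univariate rows is then encoded independently, so the encoder never depends on the variate count of a particular dataset.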

\(2\) **Official implementation**. We also implement the baselines following the corresponding official codes, including the encoder, hyperparameters, etc. The comparisons are included in Section `\ref{app:fullresults}`{=latex} of this supplementary material. For settings that match the original papers, we directly report the published results; for mismatched settings, the results come from our implementation.

Finally, for the baselines Ti-MAE [@li2023ti-TiMAE], TST [@zerveas2021transformer-TST], CoST [@woo2022cost-CoST], and TS2Vec [@Yue2022-TS2Vec], we report the results based on the unified encoder in the main text. For LaST [@wang2022learning-LaST] and TF-C [@Zhang2022-TF-C], whose designs are tied to their own model structures, we report the results of the official code implementation or their original papers. Notably, the performance of the baselines with the unified encoder (reported in the [main text]{.underline}) generally surpasses both their official implementations and the results reported in their own papers. Table `\ref{tab:baselines_detail}`{=latex} shows more details. Full experimental results are in Section `\ref{app:fullresults}`{=latex}.

Pre-training and Fine-tuning Configuration {#app:train}
------------------------------------------

We build two types of pre-training and fine-tuning scenarios, in-domain and cross-domain, on the forecasting and classification benchmarks to compare the effectiveness of our method against other time series pre-training methods.

For forecasting, we pre-train a model on one dataset and fine-tune it on the same dataset, yielding seven in-domain evaluation scenarios. For cross-domain evaluation, we pre-train a model on one specific dataset and fine-tune it on other datasets. Based on the above settings, we construct fifteen in- and cross-domain pre-training and fine-tuning experiments, covering the same dataset with the same sampling frequency, different datasets with the same sampling frequency, and different datasets with different sampling frequencies.

For classification, we construct the in-domain setting on the same dataset, Epilepsy. For the cross-domain setting, we pre-train the model on SleepEEG, a univariate time series dataset with the most complex temporal dynamics and the most samples, and then fine-tune it separately on Epilepsy, FD-B, Gesture, and EMG. These four cross-domain scenarios are especially challenging because of the few commonalities and the enormous gaps among these datasets. Table `\ref{tab:transfersetting}`{=latex} shows detailed pre-training and fine-tuning settings.

```{=latex}
\setlength{\abovecaptionskip}{0.cm}
```
```{=latex}
\setlength{\belowcaptionskip}{-0.cm}
```
```{=latex}
\vspace{5pt}
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{4pt}
  \renewcommand\arraystretch{1.5}
  \begin{tabular}{c|c|c|l}
    \toprule
    \scalebox{0.9}{Tasks} & \scalebox{0.9}{Evaluation} & \scalebox{0.9}{Scenarios} & \scalebox{0.9}{Characteristic} \\
    \toprule
    \multirow{11}{*}{\scalebox{0.9}{Fore.}} & \multirow{7}{*}{\scalebox{0.9}{In-domain}} & \scalebox{0.9}{ETTh1 $\to$ ETTh1} & \multirow{7}{*}{\scalebox{0.9}{The same dataset with the same frequency.}} \\
    &  & \scalebox{0.9}{ETTh2 $\to$ ETTh2} & \\
    &  & \scalebox{0.9}{ETTm1 $\to$ ETTm1} & \\
    &  & \scalebox{0.9}{ETTm2 $\to$ ETTm2} & \\
    &  & \scalebox{0.9}{Weather $\to$ Weather} & \\
    &  & \scalebox{0.9}{Electricity $\to$ Electricity} & \\
    &  & \scalebox{0.9}{Traffic $\to$ Traffic} & \\
    \cmidrule(lr){2-4}
    & \multirow{4}{*}{\scalebox{0.9}{Cross-domain}} & \scalebox{0.9}{ETTh2 $\to$ ETTh1} & \multirow{2}{*}{\scalebox{0.9}{Different datasets with the same frequency.}} \\
    &  & \scalebox{0.9}{ETTm2 $\to$ ETTm1} & \\
    \cmidrule(lr){3-4}
    &  & \scalebox{0.9}{\{ETTm1, ETTm2, Weather\} $\to$ ETTh1} & \multirow{2}{*}{\scalebox{0.9}{Different datasets with\ different frequencies.}} \\
    &  & \scalebox{0.9}{\{ETTh1, ETTh2, Weather\} $\to$ ETTm1} & \\
    \midrule
    \multirow{2}{*}{\scalebox{0.9}{Class.}} & \scalebox{0.9}{In-domain} & \scalebox{0.9}{Epilepsy $\to$ Epilepsy} & \scalebox{0.9}{The same dataset with the same frequency.} \\
    \cmidrule(lr){2-4}
    & \scalebox{0.9}{Cross-domain} & \scalebox{0.9}{SleepEEG $\to$ \{Epilepsy, FD-B, Gesture, EMG\}} & \scalebox{0.9}{Different datasets with different frequencies.} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
Model and Training Configuration {#app:model}
--------------------------------

Following the previous convention, we choose the encoder part of the Transformer [@vaswani2017attention-Transformer] with channel independence as the feature extractor for forecasting tasks. For classification tasks, we adopt 1D-ResNet [@He2016DeepRL] as the encoder following [@Zhang2022-TF-C]. In the pre-training stage, we pre-train the model with different learning rates and batch sizes according to the pre-training dataset. Then we fine-tune it on downstream forecasting and classification tasks supervised by the L2 and cross-entropy losses, respectively. The configuration details are in Table `\ref{tab:model_detail}`{=latex}.

```{=latex}
\setlength{\abovecaptionskip}{0.cm}
```
```{=latex}
\setlength{\belowcaptionskip}{-0.cm}
```
```{=latex}
\vspace{5pt}
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{3pt}
  \renewcommand\arraystretch{1.8}
  \begin{tabular}{c|cc|ccc|cccc}
    \toprule
    \multirow{2}{*}{Tasks} & \multicolumn{2}{c}{Encoder} & \multicolumn{3}{c}{Pre-training} & \multicolumn{4}{c}{Fine-tuning} \\
    \cmidrule(lr){2-3}\cmidrule(lr){4-6}\cmidrule(lr){7-10}
     & $e_{\rm layers}$ & $d_{\rm model}$ & learning rate  & batch size & epochs & learning rate & loss function & batch size & epochs \\
    \midrule
    Fore. &  2 & 16 & $1\rm{e}\mbox{-}3$ & 32 & 50 & $1\rm{e}\mbox{-}4$ & L2 & \{16,32\} & 10 \\
    \midrule
    Class. & 3 & 128 & $1\rm{e}\mbox{-}4$ & 128 & 10 & $1\rm{e}\mbox{-}4$ & Cross-Entropy & 32 & 300 \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-5pt}
```
Hyperparameter Sensitivity {#app:hysensitivity}
==========================

We verify the hyperparameter sensitivity of SimMTM on ETTh1 in Table `\ref{tab:hyperparameter_sensitivity}`{=latex}, including masked ratio ($\rm r$), the number of masked series ($\rm M$), temperature ($\tau$), masked function ($\rm Mask$), encoder depth ($e_{\rm layers}$), and the hidden dimension ($d_{\rm model}$). Lower MSE and MAE represent better performance.

As shown in Table `\ref{tab:hyperparameter_sensitivity}`{=latex} (a) and (b), the performance is closely tied to the trade-off between the masked ratio and the number of masked series, so a reasonable balance between these two hyperparameters is critical. For the temperature of the softmax normalization ($\tau$), an appropriately small $\tau$ leads to larger differences and more diversity among the masked sequences. For the masked function, we verify two choices: masking following a random distribution and masking following a geometric distribution [@zerveas2021transformer-TST]. The results show that geometric masking outperforms random masking. Besides, we find that 2 encoder layers are sufficient for the reconstruction task. Note that SimMTM consistently performs better than training from scratch under all these hyperparameter changes.
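For reference, the geometric mask function of [@zerveas2021transformer-TST] can be sketched as an alternation of masked and unmasked runs whose lengths follow geometric distributions, so that the expected masked fraction matches the target ratio. The sketch below is our paraphrase of that scheme, not the released implementation, and the parameter names are ours:

```python
import numpy as np

def geometric_mask(length: int, mask_ratio: float = 0.5,
                   mean_mask_len: float = 3.0, rng=None) -> np.ndarray:
    """Boolean mask of shape (length,), True = masked."""
    rng = np.random.default_rng(rng)
    # Expected lengths of masked / unmasked runs so that the overall
    # masked fraction is approximately mask_ratio.
    mean_keep_len = max(mean_mask_len * (1 - mask_ratio) / mask_ratio, 1.0)
    mask = np.zeros(length, dtype=bool)
    masked = rng.random() < mask_ratio  # initial state
    i = 0
    while i < length:
        mean_len = mean_mask_len if masked else mean_keep_len
        run = rng.geometric(1.0 / mean_len)  # run length >= 1
        mask[i:i + run] = masked
        i += run
        masked = not masked
    return mask
```

Compared with masking points independently, the contiguous runs remove whole sub-sequences of temporal variation, which is the harder and more informative reconstruction target.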

```{=latex}
\setlength{\abovecaptionskip}{0.cm}
```
```{=latex}
\setlength{\belowcaptionskip}{-0.cm}
```
```{=latex}
\vspace{5pt}
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{10.0pt}
  \renewcommand\arraystretch{1.8}
  \begin{tabular}{ccccccccc}
   \toprule
    \multicolumn{3}{c}{(a) Masked ratio} & \multicolumn{3}{c}{(b) Masked numbers} & \multicolumn{3}{c}{(c) Temperature} \\
     Ratio & MSE & MAE & Numbers & MSE & MAE & Value & MSE & MAE \\
     \cmidrule(lr){1-3}\cmidrule(lr){4-6} \cmidrule(lr){7-9}
     12.5\% & 0.429 & 0.440 & 1 & 0.429 & 0.437 & 0.02 & \textbf{0.409} & \textbf{0.428} \\
     25\% & 0.427 & 0.434 & 2 & 0.416 & 0.429 & 0.2 & \textbf{0.409} & 0.429 \\
     50\% & \textbf{0.409} & \textbf{0.428} & 3 & \textbf{0.409} & \textbf{0.428} & 2 & 0.416 & 0.428  \\
     75\% & 0.422 & 0.434 & 4 & 0.419 & 0.431 & & & \\
     \multicolumn{3}{c}{(d) Masked function} & \multicolumn{3}{c}{(e) Encoder depth} & \multicolumn{3}{c}{(f) Hidden layer dimension} \\
     Type & MSE & MAE & Layers & MSE & MAE & Dim & MSE & MAE \\
     \cmidrule(lr){1-3}\cmidrule(lr){4-6} \cmidrule(lr){7-9}
      Random & 0.409 & 0.431 & 1 & 0.420 & 0.426 & 16 & \textbf{0.409} & \textbf{0.428}  \\
      Geometric & \textbf{0.409} & \textbf{0.428} & 2 & \textbf{0.409} & \textbf{0.428} & 32 & 0.420 & 0.432 \\
      & & & 3 & 0.421 & 0.430 & 64 & 0.422 & 0.434 \\
      & & & 4 & 0.426 & 0.436 & 128 & 0.428 & 0.444 \\
      \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-5pt}
```
Ablations on Aggregation Setting {#app:ab_aggregation_set}
================================

SimMTM proposes to recover masked time points by the weighted aggregation of multiple neighbors outside the manifold. We explore two types of aggregation settings.

\(1\) **Positive Samples Aggregation (PSA)**: only aggregate multiple positive neighbors (the masked series of the same sample) to reconstruct masked time points.

\(2\) **Positive and Negative Samples Aggregation (PNSA)**: aggregate both positive and negative neighbors (the masked series of all samples) to reconstruct masked time points.

As shown in Table `\ref{tab:ab_aggregation}`{=latex}, although PSA makes good progress compared to training from scratch (Random Init.), PNSA is consistently better than PSA in all ablation settings. In masked time-series modeling, masking can be viewed as adding noise to the original data, and masked modeling projects the masked data from the neighborhood back onto the original manifold. Using both positive and negative masked time series as reconstruction candidates drives the model to select the positive samples adaptively, which helps it learn the local structure of the manifold better. Therefore, as stated in the Method section of the main text, we choose positive and negative samples aggregation (PNSA) as the standard aggregation setting of SimMTM.
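The aggregation underlying PNSA can be sketched as a temperature-scaled softmax over pairwise similarities of series-wise representations, followed by a weighted sum of point-wise representations across the batch. This is a simplified illustration of the idea, not the released SimMTM implementation; the function name, the shapes, and the choice to exclude self-similarity are ours:

```python
import numpy as np

def aggregate_neighbors(point_repr, series_repr, temperature=0.2):
    """point_repr: (n, length, dim); series_repr: (n, dim_s).

    Returns (n, length, dim): each series is rebuilt as a similarity-
    weighted sum over the point-wise representations of the other
    (positive and negative) masked series in the batch.
    """
    s = series_repr / np.linalg.norm(series_repr, axis=-1, keepdims=True)
    sim = s @ s.T / temperature            # cosine similarities / tau
    np.fill_diagonal(sim, -np.inf)         # exclude self from neighbors
    w = np.exp(sim - sim.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)     # row-wise softmax weights
    return np.einsum('ij,jld->ild', w, point_repr)
```

A small temperature sharpens the weights toward the most similar (typically positive) neighbors, matching the sensitivity result for $\tau$ reported above.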

```{=latex}
\vspace{10pt}
```
```{=latex}
\vspace{5pt}
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{12pt}
  \renewcommand\arraystretch{1.5}
  \begin{tabular}{c|c|c|l|l}
    \toprule
    \scalebox{0.9}{Tasks} & \scalebox{0.9}{Evaluation} & \scalebox{0.9}{Scenarios} & \scalebox{0.9}{Aggregation} & \scalebox{0.9}{Metric} \\
    \toprule
    \multirow{6}{*}{\scalebox{1.0}{Forecasting}} & \multirow{3}{*}{\scalebox{1.0}{In-domain}} & \multirow{3}{*}{\scalebox{1.0}{ETTh1 $\to$ ETTh1}} &  \scalebox{1.0}{Random init.} & 0.431\\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PSA)}} & 0.420 $\uparrow$ \\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PNSA)}} & \textbf{0.409} $\uparrow$ \\
    \cmidrule(lr){2-5}
    & \multirow{3}{*}{\scalebox{1.0}{Cross-domain}} & \multirow{3}{*}{\scalebox{1.0}{ETTh2 $\to$ ETTh1}}  & \scalebox{1.0}{Random init.} & 0.431 \\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PSA)}} & 0.426 $\uparrow$ \\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PNSA)}} & \textbf{0.415} $\uparrow$ \\
    \midrule
    \multirow{6}{*}{\scalebox{1.0}{Classification}} & \multirow{3}{*}{\scalebox{1.0}{In-domain}} & \multirow{3}{*}{\scalebox{1.0}{Epilepsy $\to$ Epilepsy}} & \scalebox{1.0}{Random init.} & 89.83 \\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PSA)}} & 92.56 $\uparrow$ \\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PNSA)}} & \textbf{94.75} $\uparrow$ \\
    \cmidrule(lr){2-5}
    & \multirow{3}{*}{\scalebox{1.0}{Cross-domain}} & \multirow{3}{*}{\scalebox{1.0}{SleepEEG $\to$ EMG}} & \scalebox{1.0}{Random init.} & 77.80\\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PSA)}} & 87.80 $\uparrow$ \\
    &  &  &  \scalebox{1.0}{SimMTM \textbf{(PNSA)}} & \textbf{97.56} $\uparrow$ \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{10pt}
```
Comparison of Masked Modeling {#app:show}
=============================

To investigate the reconstruction process of different masked modeling methods, we plot both the original and reconstructed time series from TST and SimMTM in Figure `\ref{fig:show_case}`{=latex}, where TST [@zerveas2021transformer-TST] follows the canonical masked modeling paradigm and learns to predict removed time points based on the remaining ones. In Figure `\ref{fig:show_case}`{=latex}, we can see that direct reconstruction is too difficult for time series, even with a 12.5% masking ratio, and with a 75% masking ratio TST degenerates even more seriously. Because of this poor reconstruction, direct reconstruction can hardly provide reliable guidance for model pre-training. In contrast, our proposed SimMTM precisely reconstructs the original time series, benefiting representation learning. These results also support our design of neighborhood reconstruction.

```{=latex}
\newpage
```
```{=latex}
\vspace{-10pt}
```
![Comparison of the canonical masked modeling paradigm TST and neighborhood aggregation masked modeling SimMTM in reconstructing time series. All the cases are shown from ETTh1.](fig/showcase.png){#fig:show_case width="105%"}

```{=latex}
\vspace{-5pt}
```
```{=latex}
\vspace{-15pt}
```
Full Results {#app:fullresults}
============

Due to the limited length of the main text, we summarize all the experiments into two parts: the main experiments and the analytical experiments. We categorize and index them in Tables `\ref{tab:mainresults_idx}`{=latex} and `\ref{tab:analysisresults_idx}`{=latex}.

`\label{app:mainresults_idx}`{=latex}

```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{13pt}
  \renewcommand\arraystretch{1.1}
  \begin{tabular}{c|c|c|c}
    \toprule
    \scalebox{1.0}{Tasks} & \scalebox{1.0}{Evaluation} & \scalebox{1.0}{Encoder} & \scalebox{1.0}{Table} \\
    \toprule
    \multirow{4}{*}{\scalebox{1.0}{Forecasting}} & \multirow{2}{*}{\scalebox{1.0}{In-domain}} & \scalebox{1.0}{The model utilized in the original papers} & \scalebox{1.0}{Table \ref{tab:forecasting_indomain_full}} \\
    \cmidrule(lr){3-4}
    &  & \scalebox{1.0}{Transformer with channel independent} & \scalebox{1.0}{Table \ref{tab:forecasting_indomain_full_simmtm}} \\
    \cmidrule(lr){2-4}
    & \multirow{2}{*}{\scalebox{1.0}{Cross-domain}} & \scalebox{1.0}{The model utilized in the original papers} & \scalebox{1.0}{Table \ref{tab:forecasting_crodomain_full}} \\
    \cmidrule(lr){3-4}
    &  & \scalebox{1.0}{Transformer with channel independent} & \scalebox{1.0}{Table \ref{tab:forecasting_crodomain_full_simmtm}}\\
    \midrule
    \multirow{4}{*}{\scalebox{1.0}{Classification}} & \multirow{2}{*}{\scalebox{1.0}{In-domain}} & \scalebox{1.0}{The model utilized in the original papers} & \scalebox{1.0}{Table \ref{tab:classification_incrodomain_papermodel}} \\
    \cmidrule(lr){3-4}
    &  & \scalebox{1.0}{1D-ResNet} & \scalebox{1.0}{Table \ref{tab:classification_incrodomain_resnets}}\\
    \cmidrule(lr){2-4}
    & \multirow{2}{*}{\scalebox{1.0}{Cross-domain}} & \scalebox{1.0}{The model utilized in the original papers} & \scalebox{1.0}{Table \ref{tab:classification_incrodomain_papermodel}} \\
    \cmidrule(lr){3-4}
    &  & \scalebox{1.0}{1D-ResNet} & \scalebox{1.0}{Table \ref{tab:classification_incrodomain_resnets}}\\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-5pt}
```
`\label{app:analysisresults_idx}`{=latex}

```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{23pt}
  \renewcommand\arraystretch{1.1}
  \begin{tabular}{c|c|c|c}
    \toprule
    \scalebox{1.0}{Tasks} & \scalebox{1.0}{Evaluation} & \scalebox{1.0}{Analysis} & \scalebox{1.0}{Table} \\
    \toprule
    \multirow{4}{*}{\scalebox{1.0}{Forecasting}} & \multirow{2}{*}{\scalebox{1.0}{In-domain}} & \scalebox{1.0}{Ablation study} & \scalebox{1.0}{Table \ref{tab:forecast_in_domain_abs_full}} \\
    \cmidrule(lr){3-4}
    &  & \scalebox{1.0}{Model generality} & \scalebox{1.0}{Table \ref{tab:general_all_results}} \\
    \cmidrule(lr){2-4}
    & \multirow{2}{*}{\scalebox{1.0}{Cross-domain}} & \scalebox{1.0}{Ablation study} & \scalebox{1.0}{Table \ref{tab:forecast_cross_domain_abs_full}} \\
    \cmidrule(lr){3-4}
    &  & \scalebox{1.0}{Limited data} & \scalebox{1.0}{Table \ref{tab:full_data_limited}}\\
    \midrule
    \multirow{2}{*}{\scalebox{1.0}{Classification}} & \scalebox{1.0}{In-domain} & \scalebox{1.0}{Ablation study} & \scalebox{1.0}{Table \ref{tab:full_ablation_classification_abs_full}} \\
    \cmidrule(lr){2-4}
    & \scalebox{1.0}{Cross-domain} & \scalebox{1.0}{Ablation study} & \scalebox{1.0}{Table \ref{tab:full_ablation_classification_abs_full}} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\newpage
```
Limitations {#appendix:limitations}
===========

SimMTM is inspired by the manifold perspective of masked modeling. Although we have provided relatively comprehensive results to verify the model's effectiveness, its performance still lacks theoretical guarantees. In fact, many of the most influential works in the self-supervised pre-training community come without theoretical analysis, such as BERT [@Devlin2018BERT], GPT-3 [@Gao2021GPT-3], MAE [@He2022-MAE] and SimMIM [@xie2022simmim]. Thus, we leave this problem as future work.

The masking ratio of masked modeling methods is an essential hyperparameter. Although we have provided a selection principle linking the masking ratio $r$ and the number of masked time series $M$ as $M\propto r$ in the main text, we still need to tune these two hyperparameters for different datasets to achieve the best performance. Notably, previous methods also chose the masking ratio solely based on empirical results [@Devlin2018BERT; @He2022-MAE]. Thus, although SimMTM has limitations in choosing hyperparameters, the principle of $M\propto r$ can somewhat ease this problem, and the strategy for choosing the masking ratio remains a potential topic in masked modeling [@Wettig2022ShouldYM].

Social Impacts
==============

This paper presents SimMTM as a new masked modeling method for time series. SimMTM achieves state-of-the-art performance in two mainstream time series analysis tasks, which makes it a good supplement to the self-supervised pre-training community. We will also publish the codebase of time-series pre-training to facilitate future research.

This paper focuses only on algorithm design. Our use of all code and datasets strictly follows the corresponding licenses (Appendix `\ref{app:dataset}`{=latex}). There is no potential ethical risk or negative social impact.

```{=latex}
\vskip 0.05in
```
`\label{tab:forecasting_indomain_full}`{=latex} `\centering`{=latex}

```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{2.0pt}
  \begin{tabular}{cc|cccccccccccccccc}
    \toprule
    \multicolumn{2}{c}{\scalebox{0.9}{Models}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{\textbf{SimMTM}}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Random init.}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Ti-MAE}} \cite{li2023ti-TiMAE}} &
    \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TST}} \cite{zerveas2021transformer-TST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{LaST}} \cite{wang2022learning-LaST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TF-C}} \cite{Zhang2022-TF-C}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{CoST}} \cite{woo2022cost-CoST}} &  \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TS2Vec}} \cite{Yue2022-TS2Vec}} \\
    \cmidrule(lr){3-18}
    \multicolumn{2}{c}{\scalebox{0.9}{Metric}} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} \\
    \toprule
    \scalebox{0.9}{\multirow{5}{*}{\rotatebox{90}{ETTh1}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.379}} & \scalebox{0.9}{\textbf{0.407}} & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.412} & \scalebox{0.9}{0.708} & \scalebox{0.9}{0.570} & \scalebox{0.9}{0.503} & \scalebox{0.9}{0.527} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.412} & \scalebox{0.9}{0.463} & \scalebox{0.9}{0.406} & \scalebox{0.9}{0.514} & \scalebox{0.9}{0.512} & \scalebox{0.9}{0.709} & \scalebox{0.9}{0.650} \\
    & \scalebox{0.8}{192} & \scalebox{0.9}{\textbf{0.412}} & \scalebox{0.9}{\textbf{0.424}} & \scalebox{0.9}{0.416} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.725} & \scalebox{0.9}{0.587} & \scalebox{0.9}{0.601} & \scalebox{0.9}{0.552} & \scalebox{0.9}{0.484} &  \scalebox{0.9}{0.468} &  \scalebox{0.9}{0.531} &  \scalebox{0.9}{0.540} & 0.655 & 0.590 & 0.927 & 0.757 \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.421}} & \scalebox{0.9}{\textbf{0.431}} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.713} & \scalebox{0.9}{0.589} & \scalebox{0.9}{0.625} & \scalebox{0.9}{0.541} & \scalebox{0.9}{0.580} & \scalebox{0.9}{0.533} & \scalebox{0.9}{0.535} & \scalebox{0.9}{0.545} & \scalebox{0.9}{0.790} & \scalebox{0.9}{0.666} & \scalebox{0.9}{0.986} & \scalebox{0.9}{0.811} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.424}} & \scalebox{0.9}{0.449} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.736} & \scalebox{0.9}{0.618} & \scalebox{0.9}{0.768} & \scalebox{0.9}{0.628} & \scalebox{0.9}{0.432} & \scalebox{0.9}{\textbf{0.432}} & \scalebox{0.9}{0.577} & \scalebox{0.9}{0.562} & \scalebox{0.9}{0.880} & \scalebox{0.9}{0.739} & \scalebox{0.9}{0.967} & \scalebox{0.9}{0.790} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \scalebox{0.9}{\textbf{0.409}} & \scalebox{0.9}{\textbf{0.428}} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.721} & \scalebox{0.9}{0.591} & \scalebox{0.9}{0.624} & \scalebox{0.9}{0.562} & \scalebox{0.9}{0.474} & \scalebox{0.9}{0.461 }& \scalebox{0.9}{0.527} & \scalebox{0.9}{0.513} & \scalebox{0.9}{0.710} & \scalebox{0.9}{0.627} & \scalebox{0.9}{0.897} & \scalebox{0.9}{0.752} \\
    \midrule
    \scalebox{0.9}{\multirow{5}{*}{\rotatebox{90}{ETTh2}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.293}} & \scalebox{0.9}{\textbf{0.347}} & \scalebox{0.9}{0.325} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.443} & \scalebox{0.9}{0.465} & \scalebox{0.9}{0.335} & \scalebox{0.9}{0.392} & \scalebox{0.9}{0.331} & \scalebox{0.9}{0.390} & \scalebox{0.9}{0.463} & \scalebox{0.9}{0.521} & \scalebox{0.9}{0.465} & \scalebox{0.9}{0.482} & \scalebox{0.9}{0.506} & \scalebox{0.9}{0.477} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.355}} & \scalebox{0.9}{\textbf{0.386}} & \scalebox{0.9}{0.400} & \scalebox{0.9}{0.424} & \scalebox{0.9}{0.533} & \scalebox{0.9}{0.516} & \scalebox{0.9}{0.444} & \scalebox{0.9}{0.441} & \scalebox{0.9}{0.451} & \scalebox{0.9}{0.452} & \scalebox{0.9}{0.525} & \scalebox{0.9}{0.561} & \scalebox{0.9}{0.671} & \scalebox{0.9}{0.599} & \scalebox{0.9}{0.567} & \scalebox{0.9}{0.547} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.370}} & \scalebox{0.9}{\textbf{0.401}} & \scalebox{0.9}{0.405} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.445} & \scalebox{0.9}{0.472} & \scalebox{0.9}{0.455}  & \scalebox{0.9}{0.494} & \scalebox{0.9}{0.460} & \scalebox{0.9}{0.478} & \scalebox{0.9}{0.850} & \scalebox{0.9}{0.883} & \scalebox{0.9}{0.848} & \scalebox{0.9}{0.776} & \scalebox{0.9}{0.694} & \scalebox{0.9}{0.628} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.395}} & \scalebox{0.9}{\textbf{0.427}} & \scalebox{0.9}{0.451} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.507} & \scalebox{0.9}{0.498} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.504} & \scalebox{0.9}{0.552} & \scalebox{0.9}{0.509} & \scalebox{0.9}{0.930} & \scalebox{0.9}{0.932} & \scalebox{0.9}{0.871} & \scalebox{0.9}{0.811} & \scalebox{0.9}{0.728} & \scalebox{0.9}{0.838} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \scalebox{0.9}{\textbf{0.353}} & \scalebox{0.9}{\textbf{0.390}} & \scalebox{0.9}{0.395} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.482} & \scalebox{0.9}{0.488} & \scalebox{0.9}{0.429} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.449} & \scalebox{0.9}{0.457} & \scalebox{0.9}{0.692} & \scalebox{0.9}{0.724} & \scalebox{0.9}{0.714} & \scalebox{0.9}{0.667} & \scalebox{0.9}{0.624} & \scalebox{0.9}{0.623} \\
    \midrule
    \scalebox{0.9}{\multirow{5}{*}{\rotatebox{90}{ETTm1}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.288}} & \scalebox{0.9}{0.348} & \scalebox{0.9}{0.295} & \scalebox{0.9}{\textbf{0.346}} & \scalebox{0.9}{0.647} & \scalebox{0.9}{0.497} & \scalebox{0.9}{0.454} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.316} & \scalebox{0.9}{0.355} & \scalebox{0.9}{0.419} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.376} & \scalebox{0.9}{0.420} & \scalebox{0.9}{0.563} & \scalebox{0.9}{0.551} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.327}} & \scalebox{0.9}{0.373} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.597} & \scalebox{0.9}{0.508} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.490} & \scalebox{0.9}{0.349} & \scalebox{0.9}{\textbf{0.366}} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.438} & \scalebox{0.9}{0.420} & \scalebox{0.9}{0.451} & \scalebox{0.9}{0.599} & \scalebox{0.9}{0.558} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.363}} & \scalebox{0.9}{\textbf{0.395}} & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.699} & \scalebox{0.9}{0.525} & \scalebox{0.9}{0.457} & \scalebox{0.9}{0.451} & \scalebox{0.9}{0.429} & \scalebox{0.9}{0.407} & \scalebox{0.9}{0.540} & \scalebox{0.9}{0.509} & \scalebox{0.9}{0.482} & \scalebox{0.9}{0.494} & \scalebox{0.9}{0.685} & \scalebox{0.9}{0.594} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.412}} & \scalebox{0.9}{\textbf{0.424}} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.786} & \scalebox{0.9}{0.596} & \scalebox{0.9}{0.594} & \scalebox{0.9}{0.488} & \scalebox{0.9}{0.496} & \scalebox{0.9}{0.464} & \scalebox{0.9}{0.552} & \scalebox{0.9}{0.548} & \scalebox{0.9}{0.628} & \scalebox{0.9}{0.578} & \scalebox{0.9}{0.831} & \scalebox{0.9}{0.698} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.348}} & \scalebox{0.9}{\textbf{0.385}} & \scalebox{0.9}{0.356} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.682} & \scalebox{0.9}{0.532} & \scalebox{0.9}{0.494} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.496} & \scalebox{0.9}{0.474} & \scalebox{0.9}{0.477} & \scalebox{0.9}{0.486} & \scalebox{0.9}{0.670} & \scalebox{0.9}{0.600} \\
    \midrule
    \scalebox{0.9}{\multirow{5}{*}{\rotatebox{90}{ETTm2}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{0.172} & \scalebox{0.9}{0.261} & \scalebox{0.9}{0.175} & \scalebox{0.9}{0.268} & \scalebox{0.9}{0.304} & \scalebox{0.9}{0.357} & \scalebox{0.9}{0.363} & \scalebox{0.9}{0.301} & \scalebox{0.9}{\textbf{0.163}} & \scalebox{0.9}{\textbf{0.255}} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.477} & \scalebox{0.9}{0.276} & \scalebox{0.9}{0.384} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.482} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.223}} & \scalebox{0.9}{\textbf{0.300}} & \scalebox{0.9}{0.240} & \scalebox{0.9}{0.312} & \scalebox{0.9}{0.334} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.342} & \scalebox{0.9}{0.364} & \scalebox{0.9}{0.239} & \scalebox{0.9}{0.303} & \scalebox{0.9}{0.422} & \scalebox{0.9}{0.490} & \scalebox{0.9}{0.500} & \scalebox{0.9}{0.532} & \scalebox{0.9}{0.545} & \scalebox{0.9}{0.536} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{0.282} & \scalebox{0.9}{\textbf{0.331}} & \scalebox{0.9}{0.298} & \scalebox{0.9}{0.351} & \scalebox{0.9}{0.420} & \scalebox{0.9}{0.441} & \scalebox{0.9}{0.414} & \scalebox{0.9}{0.361} & \scalebox{0.9}{\textbf{0.259}} & \scalebox{0.9}{0.366} & \scalebox{0.9}{0.513} & \scalebox{0.9}{0.508} & \scalebox{0.9}{0.680} & \scalebox{0.9}{0.695} & \scalebox{0.9}{0.681} & \scalebox{0.9}{0.744} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.374}} & \scalebox{0.9}{0.388} & \scalebox{0.9}{0.403} & \scalebox{0.9}{0.413} & \scalebox{0.9}{0.508} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.580} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.397} & \scalebox{0.9}{\textbf{0.382}} & \scalebox{0.9}{0.523} & \scalebox{0.9}{0.772} & \scalebox{0.9}{0.925} & \scalebox{0.9}{0.914} & \scalebox{0.9}{0.691} & \scalebox{0.9}{0.837} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.263}} & \scalebox{0.9}{\textbf{0.320}} & \scalebox{0.9}{0.279} & \scalebox{0.9}{0.336} & \scalebox{0.9}{0.392} & \scalebox{0.9}{0.417} & \scalebox{0.9}{0.425} & \scalebox{0.9}{0.371} & \scalebox{0.9}{0.265} & \scalebox{0.9}{0.327} & \scalebox{0.9}{0.465} & \scalebox{0.9}{0.562} & \scalebox{0.9}{0.595} & \scalebox{0.9}{0.631} & \scalebox{0.9}{0.591} & \scalebox{0.9}{0.650} \\
    \midrule
    \scalebox{0.9}{\multirow{5}{*}{\rotatebox{90}{Weather}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{0.158} & \scalebox{0.9}{\textbf{0.211}} & \scalebox{0.9}{0.166} & \scalebox{0.9}{0.216} & \scalebox{0.9}{0.216} & \scalebox{0.9}{0.280} & \scalebox{0.9}{0.292} & \scalebox{0.9}{0.370} & \scalebox{0.9}{\textbf{0.153}} & \scalebox{0.9}{\textbf{0.211}} & \scalebox{0.9}{0.215} & \scalebox{0.9}{0.296} & \scalebox{0.9}{0.327} & \scalebox{0.9}{0.359} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.462} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.199}} & \scalebox{0.9}{\textbf{0.249}} & \scalebox{0.9}{0.208} & \scalebox{0.9}{0.254} & \scalebox{0.9}{0.303} & \scalebox{0.9}{0.335} & \scalebox{0.9}{0.410} & \scalebox{0.9}{0.473} & \scalebox{0.9}{0.207} & \scalebox{0.9}{0.250} & \scalebox{0.9}{0.267} & \scalebox{0.9}{0.345} & \scalebox{0.9}{0.390} & \scalebox{0.9}{0.422} & \scalebox{0.9}{0.508} & \scalebox{0.9}{0.518} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.246}} & \scalebox{0.9}{0.286} & \scalebox{0.9}{0.257} & \scalebox{0.9}{0.290} & \scalebox{0.9}{0.351} & \scalebox{0.9}{0.358} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.249} & \scalebox{0.9}{\textbf{0.264}} & \scalebox{0.9}{0.299} & \scalebox{0.9}{0.360} & \scalebox{0.9}{0.477} & \scalebox{0.9}{0.446} & \scalebox{0.9}{0.545} & \scalebox{0.9}{0.549} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.317}} & \scalebox{0.9}{0.337} & \scalebox{0.9}{0.326} & \scalebox{0.9}{0.338} & \scalebox{0.9}{0.425} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.539} & \scalebox{0.9}{0.523} & \scalebox{0.9}{0.319} & \scalebox{0.9}{\textbf{0.320}} & \scalebox{0.9}{0.361} & \scalebox{0.9}{0.395} & \scalebox{0.9}{0.551} & \scalebox{0.9}{0.586} & \scalebox{0.9}{0.576} & \scalebox{0.9}{0.572} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.230}} & \scalebox{0.9}{0.271} & \scalebox{0.9}{0.239} & \scalebox{0.9}{0.275} & \scalebox{0.9}{0.324} & \scalebox{0.9}{0.343} & \scalebox{0.9}{0.419} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.232} & \scalebox{0.9}{\textbf{0.261}} & \scalebox{0.9}{0.286} & \scalebox{0.9}{0.349} & \scalebox{0.9}{0.436} & \scalebox{0.9}{0.453} & \scalebox{0.9}{0.516} & \scalebox{0.9}{0.525} \\
    \midrule
    \scalebox{0.9}{\multirow{5}{*}{\rotatebox{90}{Electricity}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.133}} & \scalebox{0.9}{\textbf{0.223}} & \scalebox{0.9}{0.190} & \scalebox{0.9}{0.279} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.412} & \scalebox{0.9}{0.292} & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.166} & \scalebox{0.9}{0.254} & \scalebox{0.9}{0.366} & \scalebox{0.9}{0.436} & \scalebox{0.9}{0.230} & \scalebox{0.9}{0.353} & \scalebox{0.9}{0.322} & \scalebox{0.9}{0.401} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.147}} & \scalebox{0.9}{\textbf{0.237}} & \scalebox{0.9}{0.195} & \scalebox{0.9}{0.285} & \scalebox{0.9}{0.400} & \scalebox{0.9}{0.460} & \scalebox{0.9}{0.270} & \scalebox{0.9}{0.373} & \scalebox{0.9}{0.178} & \scalebox{0.9}{0.278} & \scalebox{0.9}{0.366} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.253} & \scalebox{0.9}{0.371} & \scalebox{0.9}{0.343} & \scalebox{0.9}{0.416} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.166}} & \scalebox{0.9}{\textbf{0.265}} & \scalebox{0.9}{0.211} & \scalebox{0.9}{0.301} & \scalebox{0.9}{0.564} & \scalebox{0.9}{0.573} & \scalebox{0.9}{0.334} & \scalebox{0.9}{0.323} & \scalebox{0.9}{0.186} & \scalebox{0.9}{0.275} & \scalebox{0.9}{0.358} & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.197} & \scalebox{0.9}{0.287} & \scalebox{0.9}{0.362} & \scalebox{0.9}{0.435} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.203}} & \scalebox{0.9}{0.297} & \scalebox{0.9}{0.253} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.880} & \scalebox{0.9}{0.770} & \scalebox{0.9}{0.344} & \scalebox{0.9}{0.346} & \scalebox{0.9}{0.213} & \scalebox{0.9}{\textbf{0.288}} & \scalebox{0.9}{0.363} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.230} & \scalebox{0.9}{0.328} & \scalebox{0.9}{0.388} & \scalebox{0.9}{0.456} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.162}} & \scalebox{0.9}{\textbf{0.256}} & \scalebox{0.9}{0.212} & \scalebox{0.9}{0.300} & \scalebox{0.9}{0.561} & \scalebox{0.9}{0.554} & \scalebox{0.9}{0.310} & \scalebox{0.9}{0.353} & \scalebox{0.9}{0.186} & \scalebox{0.9}{0.274} & \scalebox{0.9}{0.363} & \scalebox{0.9}{0.432} & \scalebox{0.9}{0.228} & \scalebox{0.9}{0.335} & \scalebox{0.9}{0.354} & \scalebox{0.9}{0.427} \\
    \midrule
    \scalebox{0.9}{\multirow{5}{*}{\rotatebox{90}{Traffic}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.368}} & \scalebox{0.9}{\textbf{0.262}} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.309} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.482} & \scalebox{0.9}{0.559} & \scalebox{0.9}{0.454} & \scalebox{0.9}{0.706} & \scalebox{0.9}{0.385} & \scalebox{0.9}{0.613} & \scalebox{0.9}{0.340} & \scalebox{0.9}{0.751} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.321} & \scalebox{0.9}{0.367} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.373}} & \scalebox{0.9}{\textbf{0.251}} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.308} & \scalebox{0.9}{0.491} & \scalebox{0.9}{0.346} & \scalebox{0.9}{0.583} & \scalebox{0.9}{0.493} & \scalebox{0.9}{0.709} & \scalebox{0.9}{0.388} & \scalebox{0.9}{0.619} & \scalebox{0.9}{0.516} & \scalebox{0.9}{0.751} & \scalebox{0.9}{0.424} & \scalebox{0.9}{0.476} & \scalebox{0.9}{0.367} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.395}} & \scalebox{0.9}{\textbf{0.254}} & \scalebox{0.9}{0.490} & \scalebox{0.9}{0.315} & \scalebox{0.9}{0.502} & \scalebox{0.9}{0.384} & \scalebox{0.9}{0.637} & \scalebox{0.9}{0.469} & \scalebox{0.9}{0.714} & \scalebox{0.9}{0.394} & \scalebox{0.9}{0.785} & \scalebox{0.9}{0.497} & \scalebox{0.9}{0.761} & \scalebox{0.9}{0.425} & \scalebox{0.9}{0.499} & \scalebox{0.9}{0.376} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.432}} & \scalebox{0.9}{\textbf{0.290}} & \scalebox{0.9}{0.524} & \scalebox{0.9}{0.332} & \scalebox{0.9}{0.533} & \scalebox{0.9}{0.543} & \scalebox{0.9}{0.663} & \scalebox{0.9}{0.594} & \scalebox{0.9}{0.723} & \scalebox{0.9}{0.421} & \scalebox{0.9}{0.850} & \scalebox{0.9}{0.472} & \scalebox{0.9}{0.780} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.563} & \scalebox{0.9}{0.390} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.392}} & \scalebox{0.9}{\textbf{0.264}} & \scalebox{0.9}{0.490} & \scalebox{0.9}{0.316} & \scalebox{0.9}{0.489} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.611} & \scalebox{0.9}{0.503} & \scalebox{0.9}{0.713} & \scalebox{0.9}{0.397} & \scalebox{0.9}{0.717} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.761} & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.501} & \scalebox{0.9}{0.375} \\
    \bottomrule
  \end{tabular}
    \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-10pt}
```
```{=latex}
\newpage
```
```{=latex}
\vskip 0.05in
```
```{=latex}
\centering
```
`\label{tab:forecasting_indomain_full_simmtm}`{=latex}

```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{2.1pt}
  \begin{tabular}{cc|cccccccccccccccc}
    \toprule
    \multicolumn{2}{c}{\scalebox{0.9}{Models}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{\textbf{SimMTM}}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Random init.}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Ti-MAE}} \cite{li2023ti-TiMAE}} &
    \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TST}} \cite{zerveas2021transformer-TST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{LaST}} \cite{wang2022learning-LaST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TF-C}} \cite{Zhang2022-TF-C}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{CoST}} \cite{woo2022cost-CoST}} &  \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TS2Vec}} \cite{Yue2022-TS2Vec}} \\
    \cmidrule(lr){3-18}
    \multicolumn{2}{c}{\scalebox{0.9}{Metric}} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} \\
    \toprule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{ETTh1}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{0.379} & \scalebox{0.9}{\textbf{0.407}} & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.412} & \scalebox{0.9}{\textbf{0.356}} & \scalebox{0.9}{0.420} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.425} & - & - & - & - & \scalebox{0.9}{0.422} & \scalebox{0.9}{0.436} & \scalebox{0.9}{0.392} & \scalebox{0.9}{0.420} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.412}} & \scalebox{0.9}{\textbf{0.424}} & \scalebox{0.9}{0.416} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.421} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.432} & - & - & - & - & \scalebox{0.9}{0.520} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.445} & \scalebox{0.9}{0.452} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.421}} & \scalebox{0.9}{\textbf{0.431}} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.447} & \scalebox{0.9}{0.446} & \scalebox{0.9}{0.519} & \scalebox{0.9}{0.487} & - & - & - & - & \scalebox{0.9}{0.472} & \scalebox{0.9}{0.462} & \scalebox{0.9}{0.453} & \scalebox{0.9}{0.455} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.424}} & \scalebox{0.9}{\textbf{0.449}} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.469} & \scalebox{0.9}{0.482} & \scalebox{0.9}{0.515} & \scalebox{0.9}{0.504} & - & - & - & - & \scalebox{0.9}{0.525} & \scalebox{0.9}{0.501} & \scalebox{0.9}{0.495} & \scalebox{0.9}{0.496} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.409}} & \scalebox{0.9}{\textbf{0.428}} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.423} & \scalebox{0.9}{0.446} & \scalebox{0.9}{0.466} & \scalebox{0.9}{0.462} & - & - & - & - & \scalebox{0.9}{0.485} & \scalebox{0.9}{0.472} & \scalebox{0.9}{0.446} & \scalebox{0.9}{0.456} \\
    \midrule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{ETTh2}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.293}} & \scalebox{0.9}{\textbf{0.347}} & \scalebox{0.9}{0.325} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.339} & \scalebox{0.9}{0.378} & \scalebox{0.9}{0.322} & \scalebox{0.9}{0.358} & - & - & - & - & \scalebox{0.9}{0.321} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.365} & \scalebox{0.9}{0.509} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.355}} & \scalebox{0.9}{\textbf{0.386}} & \scalebox{0.9}{0.400} & \scalebox{0.9}{0.424} & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.402} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.435} & - & - & - & - & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.403} & \scalebox{0.9}{0.396} & \scalebox{0.9}{0.422} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.370}} & \scalebox{0.9}{\textbf{0.401}} & \scalebox{0.9}{0.405} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.388} & \scalebox{0.9}{0.323} & \scalebox{0.9}{0.420} & \scalebox{0.9}{0.440} & - & - & - & - & \scalebox{0.9}{0.430} & \scalebox{0.9}{0.451} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.436} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.395}} & \scalebox{0.9}{\textbf{0.427}} & \scalebox{0.9}{0.451} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.414} & \scalebox{0.9}{0.442} & \scalebox{0.9}{0.424} & \scalebox{0.9}{0.452} & - & - & - & - & \scalebox{0.9}{0.466} & \scalebox{0.9}{0.480} & \scalebox{0.9}{0.508} & \scalebox{0.9}{0.503} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.353}} & \scalebox{0.9}{\textbf{0.390}} & \scalebox{0.9}{0.395} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.386} & \scalebox{0.9}{0.404} & \scalebox{0.9}{0.421} & - & - & - & - & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.417} & \scalebox{0.9}{0.468} \\
    \midrule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{ETTm1}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.288}} & \scalebox{0.9}{0.348} & \scalebox{0.9}{0.295} & \scalebox{0.9}{0.346} & \scalebox{0.9}{0.305} & \scalebox{0.9}{0.351} & \scalebox{0.9}{0.310} & \scalebox{0.9}{0.348} & - & - & - & - & \scalebox{0.9}{0.291} & \scalebox{0.9}{\textbf{0.343}} & \scalebox{0.9}{0.681} & \scalebox{0.9}{0.689} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.327}} & \scalebox{0.9}{0.373} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.343} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.362} & \scalebox{0.9}{0.380} & - & - & - & - & \scalebox{0.9}{0.330} & \scalebox{0.9}{\textbf{0.370}} & \scalebox{0.9}{0.689} & \scalebox{0.9}{0.551} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.363}} & \scalebox{0.9}{\textbf{0.395}} & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.407} & \scalebox{0.9}{0.389} & \scalebox{0.9}{0.402} & - & - & - & - & \scalebox{0.9}{0.382} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.704} & \scalebox{0.9}{0.559} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.412}} & \scalebox{0.9}{\textbf{0.424}} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.432} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.427} & - & - & - & - & \scalebox{0.9}{0.422} & \scalebox{0.9}{0.425} & \scalebox{0.9}{0.721} & \scalebox{0.9}{0.571} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.348}} & \scalebox{0.9}{\textbf{0.385}} & \scalebox{0.9}{0.356} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.366} & \scalebox{0.9}{0.391} & \scalebox{0.9}{0.373} & \scalebox{0.9}{0.389} & - & - & - & - & \scalebox{0.9}{0.356} & \scalebox{0.9}{0.385} & \scalebox{0.9}{0.699} & \scalebox{0.9}{0.557} \\
    \midrule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{ETTm2}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.172}} & \scalebox{0.9}{0.261} & \scalebox{0.9}{0.175} & \scalebox{0.9}{0.268} & \scalebox{0.9}{0.174} & \scalebox{0.9}{\textbf{0.258}} & \scalebox{0.9}{0.215} & \scalebox{0.9}{0.296} & - & - & - & - & \scalebox{0.9}{0.242} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.224} & \scalebox{0.9}{0.303} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.223}} & \scalebox{0.9}{\textbf{0.300}} & \scalebox{0.9}{0.240} & \scalebox{0.9}{0.312} & \scalebox{0.9}{0.257} & \scalebox{0.9}{0.303} & \scalebox{0.9}{0.259} & \scalebox{0.9}{0.323} & - & - & - & - & \scalebox{0.9}{0.283} & \scalebox{0.9}{0.345} & \scalebox{0.9}{0.273} & \scalebox{0.9}{0.331} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{0.282} & \scalebox{0.9}{\textbf{0.331}} & \scalebox{0.9}{0.298} & \scalebox{0.9}{0.351} & \scalebox{0.9}{\textbf{0.277}} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.319} & \scalebox{0.9}{0.364} & - & - & - & - & \scalebox{0.9}{0.303} & \scalebox{0.9}{0.349} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.402} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{0.374} & \scalebox{0.9}{\textbf{0.388}} & \scalebox{0.9}{0.403} & \scalebox{0.9}{0.413} & \scalebox{0.9}{\textbf{0.360}} & \scalebox{0.9}{0.404} & \scalebox{0.9}{0.395} & \scalebox{0.9}{0.405} & - & - & - & - & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.406} & \scalebox{0.9}{0.408} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \scalebox{0.9}{\textbf{0.263}} & \scalebox{0.9}{\textbf{0.320}} & \scalebox{0.9}{0.279} & \scalebox{0.9}{0.336} & \scalebox{0.9}{0.267} & \scalebox{0.9}{0.325} & \scalebox{0.9}{0.297} & \scalebox{0.9}{0.347} & - & - & - & - & \scalebox{0.9}{0.314} & \scalebox{0.9}{0.365} & \scalebox{0.9}{0.326} & \scalebox{0.9}{0.361} \\
    \midrule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{Weather}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{0.158} & \scalebox{0.9}{0.211} & \scalebox{0.9}{0.166} & \scalebox{0.9}{0.216} & \scalebox{0.9}{\textbf{0.153}} & \scalebox{0.9}{\textbf{0.196}} & \scalebox{0.9}{0.162} & \scalebox{0.9}{0.214} & - & - & - & - & \scalebox{0.9}{0.216} & \scalebox{0.9}{0.280} & \scalebox{0.9}{0.154} & \scalebox{0.9}{0.205} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.199}} & \scalebox{0.9}{0.249} & \scalebox{0.9}{0.208} & \scalebox{0.9}{0.254} & \scalebox{0.9}{0.214} & \scalebox{0.9}{0.253} & \scalebox{0.9}{0.203} & \scalebox{0.9}{0.252} & - & - & - & - & \scalebox{0.9}{0.303} & \scalebox{0.9}{0.335} & \scalebox{0.9}{0.200} & \scalebox{0.9}{\textbf{0.243}} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{0.246} & \scalebox{0.9}{0.286} & \scalebox{0.9}{0.257} & \scalebox{0.9}{0.290} & \scalebox{0.9}{\textbf{0.243}} & \scalebox{0.9}{\textbf{0.272}} & \scalebox{0.9}{0.260} & \scalebox{0.9}{0.297} & - & - & - & - & \scalebox{0.9}{0.351} & \scalebox{0.9}{0.358} & \scalebox{0.9}{0.252} & \scalebox{0.9}{0.286} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.317}} & \scalebox{0.9}{0.337} & \scalebox{0.9}{0.326} & \scalebox{0.9}{0.338} & \scalebox{0.9}{0.324} & \scalebox{0.9}{0.349} & \scalebox{0.9}{0.330} & \scalebox{0.9}{0.342} & - & - & - & - & \scalebox{0.9}{0.425} & \scalebox{0.9}{0.343} & \scalebox{0.9}{0.324} & \scalebox{0.9}{\textbf{0.335}} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.230}} & \scalebox{0.9}{0.271} & \scalebox{0.9}{0.239} & \scalebox{0.9}{0.275} & \scalebox{0.9}{0.234} & \scalebox{0.9}{\textbf{0.265}} & \scalebox{0.9}{0.239} & \scalebox{0.9}{0.276} & - & - & - & - & \scalebox{0.9}{0.324} & \scalebox{0.9}{0.329} & \scalebox{0.9}{0.233} & \scalebox{0.9}{0.267} \\
    \midrule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{Electricity}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.133}} & \scalebox{0.9}{\textbf{0.223}} & \scalebox{0.9}{0.190} & \scalebox{0.9}{0.279} & \scalebox{0.9}{0.163} & \scalebox{0.9}{0.255} & \scalebox{0.9}{0.186} & \scalebox{0.9}{0.268} & - & - & - & - & \scalebox{0.9}{0.197} & \scalebox{0.9}{0.277} & \scalebox{0.9}{0.195} & \scalebox{0.9}{0.275} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{\textbf{0.147}} & \scalebox{0.9}{\textbf{0.237}} & \scalebox{0.9}{0.195} & \scalebox{0.9}{0.285} & \scalebox{0.9}{0.194} & \scalebox{0.9}{0.288} & \scalebox{0.9}{0.193} & \scalebox{0.9}{0.276} & - & - & - & - & \scalebox{0.9}{0.197} & \scalebox{0.9}{0.279} & \scalebox{0.9}{0.195} & \scalebox{0.9}{0.277} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.166}} & \scalebox{0.9}{\textbf{0.265}} & \scalebox{0.9}{0.211} & \scalebox{0.9}{0.301} & \scalebox{0.9}{0.201} & \scalebox{0.9}{0.298} & \scalebox{0.9}{0.206} & \scalebox{0.9}{0.289} & - & - & - & - & \scalebox{0.9}{0.211} & \scalebox{0.9}{0.295} & \scalebox{0.9}{0.210} & \scalebox{0.9}{0.294} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.203}} & \scalebox{0.9}{\textbf{0.297}} & \scalebox{0.9}{0.253} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.263} & \scalebox{0.9}{0.343} & \scalebox{0.9}{0.250} & \scalebox{0.9}{0.324} & - & - & - & - & \scalebox{0.9}{0.255} & \scalebox{0.9}{0.330} & \scalebox{0.9}{0.252} & \scalebox{0.9}{0.327} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg}  & \scalebox{0.9}{\textbf{0.162}} & \scalebox{0.9}{\textbf{0.256}} & \scalebox{0.9}{0.212} & \scalebox{0.9}{0.300} & \scalebox{0.9}{0.205} & \scalebox{0.9}{0.296} & \scalebox{0.9}{0.209} & \scalebox{0.9}{0.289} & - & - & - & - & \scalebox{0.9}{0.215} & \scalebox{0.9}{0.295} & \scalebox{0.9}{0.213} & \scalebox{0.9}{0.293} \\
    \midrule
    \multirow{5}{*}{\rotatebox{90}{\scalebox{0.9}{Traffic}}}
    & \scalebox{0.9}{96} & \scalebox{0.9}{\textbf{0.368}} & \scalebox{0.9}{\textbf{0.262}} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.309} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.298} & \scalebox{0.9}{0.595} & \scalebox{0.9}{0.360} & - & - & - & - & \scalebox{0.9}{0.378} & \scalebox{0.9}{0.365} & \scalebox{0.9}{0.480} & \scalebox{0.9}{0.357} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{0.373} & \scalebox{0.9}{\textbf{0.251}} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.308} & \scalebox{0.9}{0.445} & \scalebox{0.9}{0.301} & \scalebox{0.9}{0.576} & \scalebox{0.9}{0.353} & - & - & - & - & \scalebox{0.9}{\textbf{0.371}} & \scalebox{0.9}{0.352} & \scalebox{0.9}{0.439} & \scalebox{0.9}{0.336} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{\textbf{0.395}} & \scalebox{0.9}{\textbf{0.254}} & \scalebox{0.9}{0.490} & \scalebox{0.9}{0.315} & \scalebox{0.9}{0.492} & \scalebox{0.9}{0.320} & \scalebox{0.9}{0.569} & \scalebox{0.9}{0.362} & - & - & - & - & \scalebox{0.9}{0.467} & \scalebox{0.9}{0.354} & \scalebox{0.9}{0.460} & \scalebox{0.9}{0.344} \\
    & \scalebox{0.9}{720} & \scalebox{0.9}{\textbf{0.432}} & \scalebox{0.9}{\textbf{0.290}} & \scalebox{0.9}{0.524} & \scalebox{0.9}{0.332} & \scalebox{0.9}{0.514} & \scalebox{0.9}{0.321} & \scalebox{0.9}{0.603} & \scalebox{0.9}{0.372} & - & - & - & - & \scalebox{0.9}{0.525} & \scalebox{0.9}{0.378} & \scalebox{0.9}{0.499} & \scalebox{0.9}{0.364} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \scalebox{0.9}{\textbf{0.392}} & \scalebox{0.9}{\textbf{0.264}} & \scalebox{0.9}{0.490} & \scalebox{0.9}{0.316} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.310} & \scalebox{0.9}{0.586} & \scalebox{0.9}{0.362} & - & - & - & - & \scalebox{0.9}{0.435} & \scalebox{0.9}{0.362} & \scalebox{0.9}{0.470} & \scalebox{0.9}{0.350} \\
    \bottomrule
  \end{tabular}
    \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{-10pt}
```
```{=latex}
\newpage
```
```{=latex}
\begin{table*}[h]\vspace{-10pt}
  \caption{Complete results of the long-term forecasting task under the cross-domain setting: forecasting the future $O \in \{96,192,336,720\}$ time points based on the past 336 time points. \textbf{All the results of the baselines are obtained with the encoders used in their original papers.} The standard deviations of SimMTM are within 0.005 for MSE and within 0.004 for MAE.}
  \label{tab:forecasting_crodomain_full}
  \vspace{-5pt}
  \vskip 0.15in
  \centering
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{1.2pt}
  \renewcommand\arraystretch{1.0}
  \begin{tabular}{cc|cccccccccccccccc}
    \toprule
    \multicolumn{2}{c}{\scalebox{0.8}{Models}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{\textbf{SimMTM}}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{Random init.}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{Ti-MAE}} \cite{li2023ti-TiMAE}} &
    \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{TST}} \cite{zerveas2021transformer-TST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{LaST}} \cite{wang2022learning-LaST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{TF-C}} \cite{Zhang2022-TF-C}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{CoST}} \cite{woo2022cost-CoST}} &  \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.8}{TS2Vec}} \cite{Yue2022-TS2Vec}} \\
    \cmidrule(lr){3-18}
    \multicolumn{2}{c}{\scalebox{0.8}{Metric}} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} \\
    \toprule
    \multirow{5}{*}{\scalebox{0.8}{\shortstack{ETTh2\\ $\downarrow$ \\ETTh1}}}
    &  \scalebox{0.8}{96} & \scalebox{0.8}{0.372} & \scalebox{0.8}{\textbf{0.401}} & \scalebox{0.8}{0.380} & \scalebox{0.8}{0.412} & \scalebox{0.8}{0.703} & \scalebox{0.8}{0.562} & \scalebox{0.8}{0.653} & \scalebox{0.8}{0.468} & \scalebox{0.8}{\textbf{0.362}} & \scalebox{0.8}{0.420} & \scalebox{0.8}{0.596} & \scalebox{0.8}{0.569} & \scalebox{0.8}{0.378} & \scalebox{0.8}{0.421} & \scalebox{0.8}{0.849} & \scalebox{0.8}{0.694} \\
    &  \scalebox{0.8}{192} & \scalebox{0.8}{\textbf{0.414}} & \scalebox{0.8}{\textbf{0.425}} & \scalebox{0.8}{0.416} & \scalebox{0.8}{0.434} & \scalebox{0.8}{0.715} & \scalebox{0.8}{0.567} & \scalebox{0.8}{0.658} & \scalebox{0.8}{0.502} & \scalebox{0.8}{0.426} & \scalebox{0.8}{0.478} & \scalebox{0.8}{0.614} & \scalebox{0.8}{0.621} & \scalebox{0.8}{0.424} & \scalebox{0.8}{0.451} & \scalebox{0.8}{0.909} & \scalebox{0.8}{0.738} \\
    &  \scalebox{0.8}{336} & \scalebox{0.8}{\textbf{0.429}} & \scalebox{0.8}{\textbf{0.436}} & \scalebox{0.8}{0.448} & \scalebox{0.8}{0.458} & \scalebox{0.8}{0.733} & \scalebox{0.8}{0.579} & \scalebox{0.8}{0.631} & \scalebox{0.8}{0.561} & \scalebox{0.8}{0.522} & \scalebox{0.8}{0.509} & \scalebox{0.8}{0.694} & \scalebox{0.8}{0.664} & \scalebox{0.8}{0.651} & \scalebox{0.8}{0.582} & \scalebox{0.8}{1.082} & \scalebox{0.8}{0.775} \\
    &  \scalebox{0.8}{720} & \scalebox{0.8}{\textbf{0.446}} & \scalebox{0.8}{\textbf{0.458}} & \scalebox{0.8}{0.481} & \scalebox{0.8}{0.487} & \scalebox{0.8}{0.762} & \scalebox{0.8}{0.622} & \scalebox{0.8}{0.638} & \scalebox{0.8}{0.608} & \scalebox{0.8}{0.460} & \scalebox{0.8}{0.478} & \scalebox{0.8}{0.635} & \scalebox{0.8}{0.683} & \scalebox{0.8}{0.883} & \scalebox{0.8}{0.701} & \scalebox{0.8}{0.934} & \scalebox{0.8}{0.769} \\
    \cmidrule(lr){2-18}
    &  \scalebox{0.8}{Avg} & \scalebox{0.8}{\textbf{0.415}} & \scalebox{0.8}{\textbf{0.430}} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.448} & \scalebox{0.8}{0.728} & \scalebox{0.8}{0.583} & \scalebox{0.8}{0.645} & \scalebox{0.8}{0.535} & \scalebox{0.8}{0.443} & \scalebox{0.8}{0.471} & \scalebox{0.8}{0.635} & \scalebox{0.8}{0.634} & \scalebox{0.8}{0.584} & \scalebox{0.8}{0.539} & \scalebox{0.8}{0.944} & \scalebox{0.8}{0.744} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{\shortstack{ETTm1\\ $\downarrow$ \\ETTh1}}}
    &  \scalebox{0.8}{96} & \scalebox{0.8}{0.367} & \scalebox{0.8}{0.398} & \scalebox{0.8}{0.380} & \scalebox{0.8}{0.412} & \scalebox{0.8}{0.715} & \scalebox{0.8}{0.581} & \scalebox{0.8}{0.627} & \scalebox{0.8}{0.477} & \scalebox{0.8}{\textbf{0.360}} & \scalebox{0.8}{\textbf{0.374}} & \scalebox{0.8}{0.666} & \scalebox{0.8}{0.647} & \scalebox{0.8}{0.423} & \scalebox{0.8}{0.450} & \scalebox{0.8}{0.991} & \scalebox{0.8}{0.765} \\
    &  \scalebox{0.8}{192} & \scalebox{0.8}{0.396} & \scalebox{0.8}{0.421} & \scalebox{0.8}{0.416} & \scalebox{0.8}{0.434} & \scalebox{0.8}{0.729} & \scalebox{0.8}{0.587} & \scalebox{0.8}{0.628} & \scalebox{0.8}{0.500} & \scalebox{0.8}{\textbf{0.381}} & \scalebox{0.8}{\textbf{0.371}} & \scalebox{0.8}{0.672} & \scalebox{0.8}{0.653} & \scalebox{0.8}{0.641} & \scalebox{0.8}{0.578} & \scalebox{0.8}{0.829} & \scalebox{0.8}{0.699} \\
    &  \scalebox{0.8}{336} & \scalebox{0.8}{0.471} & \scalebox{0.8}{\textbf{0.437}} & \scalebox{0.8}{\textbf{0.448}} & \scalebox{0.8}{0.458} & \scalebox{0.8}{0.712} & \scalebox{0.8}{0.583} & \scalebox{0.8}{0.683} & \scalebox{0.8}{0.554} & \scalebox{0.8}{0.472} & \scalebox{0.8}{0.531} & \scalebox{0.8}{0.626} & \scalebox{0.8}{0.711} & \scalebox{0.8}{0.863} & \scalebox{0.8}{0.694} & \scalebox{0.8}{0.971} & \scalebox{0.8}{0.787} \\
    &  \scalebox{0.8}{720} & \scalebox{0.8}{\textbf{0.454}} & \scalebox{0.8}{\textbf{0.463}} & \scalebox{0.8}{0.481} & \scalebox{0.8}{0.487} & \scalebox{0.8}{0.747} & \scalebox{0.8}{0.627} & \scalebox{0.8}{0.642} & \scalebox{0.8}{0.600} & \scalebox{0.8}{0.490} & \scalebox{0.8}{0.488} & \scalebox{0.8}{0.835} & \scalebox{0.8}{0.797} & \scalebox{0.8}{1.071} & \scalebox{0.8}{0.805} & \scalebox{0.8}{1.037} & \scalebox{0.8}{0.820} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{\textbf{0.422}} & \scalebox{0.8}{\textbf{0.430}} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.448} & \scalebox{0.8}{0.726} & \scalebox{0.8}{0.595} & \scalebox{0.8}{0.645} & \scalebox{0.8}{0.533} & \scalebox{0.8}{0.426} & \scalebox{0.8}{0.441} & \scalebox{0.8}{0.700} & \scalebox{0.8}{0.702} & \scalebox{0.8}{0.750} & \scalebox{0.8}{0.632} & \scalebox{0.8}{0.957} & \scalebox{0.8}{0.768} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{\shortstack{ETTm2\\ $\downarrow$ \\ETTh1}}}
    &  \scalebox{0.8}{96} & \scalebox{0.8}{0.388} & \scalebox{0.8}{0.421} & \scalebox{0.8}{0.380} & \scalebox{0.8}{\textbf{0.412}} & \scalebox{0.8}{0.699} & \scalebox{0.8}{0.566} & \scalebox{0.8}{0.559} & \scalebox{0.8}{0.489} & \scalebox{0.8}{0.428} & \scalebox{0.8}{0.454} & \scalebox{0.8}{0.968} & \scalebox{0.8}{0.738} & \scalebox{0.8}{\textbf{0.377}} & \scalebox{0.8}{0.419} & \scalebox{0.8}{0.783} & \scalebox{0.8}{0.669} \\
    &  \scalebox{0.8}{192} & \scalebox{0.8}{0.419} & \scalebox{0.8}{\textbf{0.423}} & \scalebox{0.8}{\textbf{0.416}} & \scalebox{0.8}{0.434} & \scalebox{0.8}{0.722} & \scalebox{0.8}{0.573} & \scalebox{0.8}{0.600} & \scalebox{0.8}{0.579} & \scalebox{0.8}{0.427} & \scalebox{0.8}{0.497} & \scalebox{0.8}{1.080} & \scalebox{0.8}{0.801} & \scalebox{0.8}{0.422} & \scalebox{0.8}{0.450} & \scalebox{0.8}{0.828} & \scalebox{0.8}{0.691} \\
    &  \scalebox{0.8}{336} & \scalebox{0.8}{\textbf{0.435}} & \scalebox{0.8}{\textbf{0.444}} & \scalebox{0.8}{0.448} & \scalebox{0.8}{0.458} & \scalebox{0.8}{0.714} & \scalebox{0.8}{0.569} & \scalebox{0.8}{0.677} & \scalebox{0.8}{0.572} & \scalebox{0.8}{0.528} & \scalebox{0.8}{0.540} & \scalebox{0.8}{1.091} & \scalebox{0.8}{0.824} & \scalebox{0.8}{0.648} & \scalebox{0.8}{0.580} & \scalebox{0.8}{0.990} & \scalebox{0.8}{0.762} \\
    &  \scalebox{0.8}{720} & \scalebox{0.8}{\textbf{0.468}} & \scalebox{0.8}{\textbf{0.474}} & \scalebox{0.8}{0.481} & \scalebox{0.8}{0.487} & \scalebox{0.8}{0.760} & \scalebox{0.8}{0.611} & \scalebox{0.8}{0.694} & \scalebox{0.8}{0.664} & \scalebox{0.8}{0.527} & \scalebox{0.8}{0.537} & \scalebox{0.8}{1.226} & \scalebox{0.8}{0.893} & \scalebox{0.8}{0.880} & \scalebox{0.8}{0.699} & \scalebox{0.8}{0.985} & \scalebox{0.8}{0.783} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{\textbf{0.428}} & \scalebox{0.8}{\textbf{0.441}} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.448} & \scalebox{0.8}{0.724} & \scalebox{0.8}{0.580} & \scalebox{0.8}{0.632} & \scalebox{0.8}{0.576} & \scalebox{0.8}{0.503} & \scalebox{0.8}{0.507} & \scalebox{0.8}{1.091} & \scalebox{0.8}{0.814} & \scalebox{0.8}{0.582} & \scalebox{0.8}{0.537} & \scalebox{0.8}{0.896} & \scalebox{0.8}{0.726} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{$\shortstack{Weather\\ $\downarrow$ \\ETTh1}$}}
    &  \scalebox{0.8}{96} & \scalebox{0.8}{0.477} & \scalebox{0.8}{0.444} & \scalebox{0.8}{\textbf{0.380}} & \scalebox{0.8}{\textbf{0.412}} & - & - & - & - & - & - & - & - & - & - & - & - \\
    &  \scalebox{0.8}{192} & \scalebox{0.8}{0.454} & \scalebox{0.8}{0.522} & \scalebox{0.8}{\textbf{0.416}} & \scalebox{0.8}{\textbf{0.434}} & - & - & - & - & - & - & - & - & - & - & - & - \\
    &  \scalebox{0.8}{336} & \scalebox{0.8}{\textbf{0.424}} & \scalebox{0.8}{\textbf{0.434}} & \scalebox{0.8}{0.448} & \scalebox{0.8}{0.458} & - & - & - & - & - & - & - & - & - & - & - & - \\
    &  \scalebox{0.8}{720} & \scalebox{0.8}{\textbf{0.468}} & \scalebox{0.8}{\textbf{0.469}} & \scalebox{0.8}{0.481} & \scalebox{0.8}{0.487} & - & - & - & - & - & - & - & - & - & - & - & - \\
    \cmidrule(lr){2-18}
    &  \scalebox{0.8}{Avg} & \scalebox{0.8}{0.456} & \scalebox{0.8}{0.467} & \scalebox{0.8}{\textbf{0.431}} & \scalebox{0.8}{\textbf{0.448}} & - & - & - & - & - & - & - & - & - & - & - & - \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{$\shortstack{ETTh1\\ $\downarrow$ \\ETTm1}$}}
    & \scalebox{0.8}{96} & \scalebox{0.8}{0.290} & \scalebox{0.8}{0.348} & \scalebox{0.8}{0.295} & \scalebox{0.8}{0.346} & \scalebox{0.8}{0.667} & \scalebox{0.8}{0.521} & \scalebox{0.8}{0.425} & \scalebox{0.8}{0.381} & \scalebox{0.8}{0.295} & \scalebox{0.8}{0.387} & \scalebox{0.8}{0.672} & \scalebox{0.8}{0.600} & \scalebox{0.8}{\textbf{0.248}} & \scalebox{0.8}{\textbf{0.332}} & \scalebox{0.8}{0.605} & \scalebox{0.8}{0.561} \\
    & \scalebox{0.8}{192} & \scalebox{0.8}{\textbf{0.327}} & \scalebox{0.8}{\textbf{0.372}} & \scalebox{0.8}{0.333} & \scalebox{0.8}{0.374} & \scalebox{0.8}{0.561} & \scalebox{0.8}{0.479} & \scalebox{0.8}{0.495} & \scalebox{0.8}{0.478} & \scalebox{0.8}{0.335} & \scalebox{0.8}{0.379} & \scalebox{0.8}{0.721} & \scalebox{0.8}{0.639} & \scalebox{0.8}{0.336} & \scalebox{0.8}{0.391} & \scalebox{0.8}{0.615} & \scalebox{0.8}{0.561} \\
    & \scalebox{0.8}{336} & \scalebox{0.8}{\textbf{0.357}} & \scalebox{0.8}{0.392} & \scalebox{0.8}{0.370} & \scalebox{0.8}{0.398} & \scalebox{0.8}{0.690} & \scalebox{0.8}{0.533} & \scalebox{0.8}{0.456} & \scalebox{0.8}{0.441} & \scalebox{0.8}{0.379} & \scalebox{0.8}{\textbf{0.363}} & \scalebox{0.8}{0.755} & \scalebox{0.8}{0.664} & \scalebox{0.8}{0.381} & \scalebox{0.8}{0.421} & \scalebox{0.8}{0.763} & \scalebox{0.8}{0.677} \\
    & \scalebox{0.8}{720} & \scalebox{0.8}{0.409} & \scalebox{0.8}{\textbf{0.423}} & \scalebox{0.8}{0.427} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.744} & \scalebox{0.8}{0.583} & \scalebox{0.8}{0.554} & \scalebox{0.8}{0.477} & \scalebox{0.8}{\textbf{0.403}} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.837} & \scalebox{0.8}{0.705} & \scalebox{0.8}{0.469} & \scalebox{0.8}{0.482} & \scalebox{0.8}{0.805} & \scalebox{0.8}{0.664} \\
    \cmidrule(lr){2-18} 
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{\textbf{0.346}} & \scalebox{0.8}{\textbf{0.384}} & \scalebox{0.8}{0.356} & \scalebox{0.8}{0.387} & \scalebox{0.8}{0.666} & \scalebox{0.8}{0.529} & \scalebox{0.8}{0.482} & \scalebox{0.8}{0.444} & \scalebox{0.8}{0.353} & \scalebox{0.8}{0.390} & \scalebox{0.8}{0.746} & \scalebox{0.8}{0.652} & \scalebox{0.8}{0.359} & \scalebox{0.8}{0.407} & \scalebox{0.8}{0.697} & \scalebox{0.8}{0.616} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{$\shortstack{ETTh2\\ $\downarrow$ \\ETTm1}$}} 
    & \scalebox{0.8}{96} & \scalebox{0.8}{0.322} & \scalebox{0.8}{0.347} & \scalebox{0.8}{0.295} & \scalebox{0.8}{0.346} & \scalebox{0.8}{0.658} & \scalebox{0.8}{0.505} & \scalebox{0.8}{0.449} & \scalebox{0.8}{0.343} & \scalebox{0.8}{0.314} & \scalebox{0.8}{0.396} & \scalebox{0.8}{0.677} & \scalebox{0.8}{0.603} & \scalebox{0.8}{\textbf{0.253}} & \scalebox{0.8}{\textbf{0.342}} & \scalebox{0.8}{0.466} & \scalebox{0.8}{0.480} \\
    & \scalebox{0.8}{192} & \scalebox{0.8}{\textbf{0.332}} & \scalebox{0.8}{\textbf{0.372}} & \scalebox{0.8}{0.333} & \scalebox{0.8}{0.374} & \scalebox{0.8}{0.594} & \scalebox{0.8}{0.511} & \scalebox{0.8}{0.477} & \scalebox{0.8}{0.407} & \scalebox{0.8}{0.587} & \scalebox{0.8}{0.545} & \scalebox{0.8}{0.718} & \scalebox{0.8}{0.638} & \scalebox{0.8}{0.367} & \scalebox{0.8}{0.392} & \scalebox{0.8}{0.557} & \scalebox{0.8}{0.532} \\
    & \scalebox{0.8}{336} & \scalebox{0.8}{0.394} & \scalebox{0.8}{0.391} & \scalebox{0.8}{\textbf{0.370}} & \scalebox{0.8}{\textbf{0.398}} & \scalebox{0.8}{0.732} & \scalebox{0.8}{0.532} & \scalebox{0.8}{0.407} & \scalebox{0.8}{0.519} & \scalebox{0.8}{0.631} & \scalebox{0.8}{0.584} & \scalebox{0.8}{0.755} & \scalebox{0.8}{0.663} & \scalebox{0.8}{0.388} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.646} & \scalebox{0.8}{0.576} \\
    & \scalebox{0.8}{720} & \scalebox{0.8}{\textbf{0.411}} & \scalebox{0.8}{\textbf{0.424}} & \scalebox{0.8}{0.427} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.768} & \scalebox{0.8}{0.592} & \scalebox{0.8}{0.557} & \scalebox{0.8}{0.523} & \scalebox{0.8}{0.368} & \scalebox{0.8}{0.429} & \scalebox{0.8}{0.848} & \scalebox{0.8}{0.712} & \scalebox{0.8}{0.498} & \scalebox{0.8}{0.488} & \scalebox{0.8}{0.752} & \scalebox{0.8}{0.638} \\
    \cmidrule(lr){2-18} 
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{0.365} & \scalebox{0.8}{0.384} & \textbf{\scalebox{0.8}{0.356}} & \textbf{\scalebox{0.8}{0.387}} & \scalebox{0.8}{0.688} & \scalebox{0.8}{0.535} & \scalebox{0.8}{0.472} & \scalebox{0.8}{0.448} & \scalebox{0.8}{0.475} & \scalebox{0.8}{0.489} & \scalebox{0.8}{0.750} & \scalebox{0.8}{0.654} & \scalebox{0.8}{0.377} & \scalebox{0.8}{0.413} & \scalebox{0.8}{0.606} & \scalebox{0.8}{0.556} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{$\shortstack{ETTm2\\ $\downarrow$ \\ETTm1}$}} 
    &  \scalebox{0.8}{96} & \scalebox{0.8}{0.297} & \scalebox{0.8}{0.348} & \scalebox{0.8}{0.295} & \scalebox{0.8}{0.346} & \scalebox{0.8}{0.647} & \scalebox{0.8}{0.497} & \scalebox{0.8}{0.471} & \scalebox{0.8}{0.422} & \scalebox{0.8}{0.304} & \scalebox{0.8}{0.388} & \scalebox{0.8}{0.610} & \scalebox{0.8}{0.577} & \textbf{\scalebox{0.8}{0.239}} & \textbf{\scalebox{0.8}{0.331}} & \scalebox{0.8}{0.586} & \scalebox{0.8}{0.515} \\
    & \scalebox{0.8}{192} & \textbf{\scalebox{0.8}{0.332}} & \textbf{\scalebox{0.8}{0.370}} & \scalebox{0.8}{0.333} & \scalebox{0.8}{0.374} & \scalebox{0.8}{0.597} & \scalebox{0.8}{0.508} & \scalebox{0.8}{0.495} & \scalebox{0.8}{0.442} & \scalebox{0.8}{0.429} & \scalebox{0.8}{0.494} & \scalebox{0.8}{0.725} & \scalebox{0.8}{0.657} & \scalebox{0.8}{0.339} & \scalebox{0.8}{0.371} & \scalebox{0.8}{0.624} & \scalebox{0.8}{0.562} \\
    & \scalebox{0.8}{336} & \textbf{\scalebox{0.8}{0.364}} & \textbf{\scalebox{0.8}{0.393}} & \scalebox{0.8}{0.370} & \scalebox{0.8}{0.398} & \scalebox{0.8}{0.700} & \scalebox{0.8}{0.525} & \scalebox{0.8}{0.455} & \scalebox{0.8}{0.424} & \scalebox{0.8}{0.499} & \scalebox{0.8}{0.523} & \scalebox{0.8}{0.768} & \scalebox{0.8}{0.684} & \scalebox{0.8}{0.371} & \scalebox{0.8}{0.421} & \scalebox{0.8}{1.035} & \scalebox{0.8}{0.806} \\
    & \scalebox{0.8}{720} & \scalebox{0.8}{\textbf{0.410}} & \scalebox{0.8}{\textbf{0.421}} & \scalebox{0.8}{0.427} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.786} & \scalebox{0.8}{0.596} & \scalebox{0.8}{0.498} & \scalebox{0.8}{0.532} & \scalebox{0.8}{0.422} & \scalebox{0.8}{0.450} & \scalebox{0.8}{0.927} & \scalebox{0.8}{0.759} & \scalebox{0.8}{0.467} & \scalebox{0.8}{0.481} & \scalebox{0.8}{0.780} & \scalebox{0.8}{0.669} \\
    \cmidrule(lr){2-18}  
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{\textbf{0.351}} & \scalebox{0.8}{\textbf{0.383}} & \scalebox{0.8}{0.356} & \scalebox{0.8}{0.387} & \scalebox{0.8}{0.682} & \scalebox{0.8}{0.531} & \scalebox{0.8}{0.480} & \scalebox{0.8}{0.455} & \scalebox{0.8}{0.414} & \scalebox{0.8}{0.464} & \scalebox{0.8}{0.758} & \scalebox{0.8}{0.669} & \scalebox{0.8}{0.354} & \scalebox{0.8}{0.401} & \scalebox{0.8}{0.756} & \scalebox{0.8}{0.638} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{$\shortstack{Weather\\ $\downarrow$ \\ETTm1}$}} 
    &  \scalebox{0.8}{96} & \scalebox{0.8}{0.304} & \scalebox{0.8}{0.354} & \scalebox{0.8}{\textbf{0.295}} & \scalebox{0.8}{\textbf{0.346}} & - & - & - & - & - & - & - & - & - & - & - & - \\
    &  \scalebox{0.8}{192} & \scalebox{0.8}{0.338} & \scalebox{0.8}{0.375} & \scalebox{0.8}{\textbf{0.333}} & \scalebox{0.8}{\textbf{0.374}} & - & - & - & - & - & - & - & - & - & - & - & - \\
    &  \scalebox{0.8}{336} & \scalebox{0.8}{0.371} & \scalebox{0.8}{\textbf{0.397}} & \scalebox{0.8}{\textbf{0.370}} & \scalebox{0.8}{0.398} & - & - & - & - & - & - & - & - & - & - & - & - \\
    &  \scalebox{0.8}{720} & \scalebox{0.8}{\textbf{0.417}} & \scalebox{0.8}{\textbf{0.426}} & \scalebox{0.8}{0.427} & \scalebox{0.8}{0.431} & - & - & - & - & - & - & - & - & - & - & - & - \\
    \cmidrule(lr){2-18} 
    &  \scalebox{0.8}{Avg} & \scalebox{0.8}{0.358} & \scalebox{0.8}{0.388} & \scalebox{0.8}{\textbf{0.356}} & \scalebox{0.8}{\textbf{0.387}} & - & - & - & - & - & - & - & - & - & - & - & - \\
    \bottomrule
  \end{tabular}
  \end{small}
  \vspace{-2pt}
  %\vspace{-1.0em}
\end{table*}
```
```{=latex}
\newpage
```
```{=latex}
\begin{table*}[h]\vspace{-10pt}
  \caption{Complete results of long-term forecasting tasks for the cross-domain setting. \textbf{All baseline results are based on the unified channel-independent Transformer encoder.} The past sequence length is set to 336. The unified channel-independent Transformer encoder enables transfer between datasets with different numbers of variables. The standard deviations of SimMTM are within 0.005 for MSE and within 0.004 for MAE.}
  \label{tab:forecasting_crodomain_full_simmtm}
  \vspace{-5pt}
  \vskip 0.15in
  \centering
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{1.5pt}
  \renewcommand\arraystretch{1.0}
  \begin{tabular}{c|c|cccccccccccccccc}
    \toprule
    \multicolumn{2}{c}{Models} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{\textbf{SimMTM}}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Random init.}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Ti-MAE}} \cite{li2023ti-TiMAE}} &
    \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TST}} \cite{zerveas2021transformer-TST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{LaST}} \cite{wang2022learning-LaST}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TF-C}} \cite{Zhang2022-TF-C}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{CoST}} \cite{woo2022cost-CoST}} &  \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TS2Vec}} \cite{Yue2022-TS2Vec}} \\
    \cmidrule(lr){3-18}
    \multicolumn{2}{c}{Metric} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} \\
    \toprule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{ETTh2\\ $\downarrow$ \\ETTh1}$}}
    &  \scalebox{0.9}{96} & \textbf{\scalebox{0.9}{0.372}} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.412} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.424} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.425} & - & - & - & - & \scalebox{0.9}{0.376} & \textbf{\scalebox{0.9}{0.362}} & \scalebox{0.9}{0.436} & \scalebox{0.9}{0.430} \\
    &  \scalebox{0.9}{192} & \scalebox{0.9}{0.414} & \scalebox{0.9}{0.425} & \scalebox{0.9}{0.416} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.454} & \scalebox{0.9}{0.440} & \scalebox{0.9}{0.531} & \scalebox{0.9}{0.484} & - & - & - & - & \textbf{\scalebox{0.9}{0.376}} & \textbf{\scalebox{0.9}{0.362}} & \scalebox{0.9}{0.455} & \scalebox{0.9}{0.440} \\
    &  \scalebox{0.9}{336} & \textbf{\scalebox{0.9}{0.429}} & \textbf{\scalebox{0.9}{0.436}} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.497} & \scalebox{0.9}{0.469} & \scalebox{0.9}{0.474} & \scalebox{0.9}{0.459} & - & - & - & - & \scalebox{0.9}{0.444} & \scalebox{0.9}{0.444} & \scalebox{0.9}{0.689} & \scalebox{0.9}{0.584} \\
    &  \scalebox{0.9}{720} & \textbf{\scalebox{0.9}{0.446}} & \textbf{\scalebox{0.9}{0.458}} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.515} & \scalebox{0.9}{0.492} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.469} & - & - & - & - & \scalebox{0.9}{0.517} & \scalebox{0.9}{0.510} & \scalebox{0.9}{0.489} & \scalebox{0.9}{0.490} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \textbf{\scalebox{0.9}{0.415}} & \textbf{\scalebox{0.9}{0.430}} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.466} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.469} & \scalebox{0.9}{0.459} & - & - & - & - & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.517} & \scalebox{0.9}{0.486} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{ETTm1\\ $\downarrow$ \\ETTh1}$}}
    &  \scalebox{0.9}{96} & \textbf{\scalebox{0.9}{0.367}} & \textbf{\scalebox{0.9}{0.398}} & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.412} & \scalebox{0.9}{0.400} & \scalebox{0.9}{0.418} & \scalebox{0.9}{0.443} & \scalebox{0.9}{0.440} & - & - & - & - & \scalebox{0.9}{0.465} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.413} & \scalebox{0.9}{0.443} \\
    &  \scalebox{0.9}{192} & \textbf{\scalebox{0.9}{0.396}} & \textbf{\scalebox{0.9}{0.421}} & \scalebox{0.9}{0.416} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.445} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.455} & - & - & - & - & \scalebox{0.9}{0.722} & \scalebox{0.9}{0.588} & \scalebox{0.9}{0.459} & \scalebox{0.9}{0.465} \\
    &  \scalebox{0.9}{336} & \scalebox{0.9}{0.471} & \textbf{\scalebox{0.9}{0.437}} & \textbf{\scalebox{0.9}{0.448}} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.510} & \scalebox{0.9}{0.467} & \scalebox{0.9}{0.462} & \scalebox{0.9}{0.455} & - & - & - & - & \scalebox{0.9}{0.712} & \scalebox{0.9}{0.586} & \scalebox{0.9}{0.614} & \scalebox{0.9}{0.554} \\
    &  \scalebox{0.9}{720} & \scalebox{0.9}{0.454} & \textbf{\scalebox{0.9}{0.463}} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.636} & \scalebox{0.9}{0.544} & \scalebox{0.9}{0.525} & \scalebox{0.9}{0.503} & - & - & - & - & \scalebox{0.9}{0.581} & \scalebox{0.9}{0.533} & \textbf{\scalebox{0.9}{0.450}} & \scalebox{0.9}{0.464} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \textbf{\scalebox{0.9}{0.422}} & \textbf{\scalebox{0.9}{0.430}} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.495} & \scalebox{0.9}{0.469} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.463} & - & - & - & - & \scalebox{0.9}{0.620} & \scalebox{0.9}{0.541} & \scalebox{0.9}{0.484} & \scalebox{0.9}{0.482} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{ETTm2\\ $\downarrow$ \\ETTh1}$}}
    &  \scalebox{0.9}{96} & \scalebox{0.9}{0.388} & \scalebox{0.9}{0.421} & \textbf{\scalebox{0.9}{0.380}} & \textbf{\scalebox{0.9}{0.412}} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.389} & \scalebox{0.9}{0.413} & - & - & - & - & \scalebox{0.9}{0.403} & \scalebox{0.9}{0.426} & \scalebox{0.9}{0.483} & \scalebox{0.9}{0.480} \\
    &  \scalebox{0.9}{192} & \scalebox{0.9}{0.419} & \textbf{\scalebox{0.9}{0.423}} & \textbf{\scalebox{0.9}{0.416}} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.474} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.463} & \scalebox{0.9}{0.452} & - & - & - & - & \scalebox{0.9}{0.457} & \scalebox{0.9}{0.468} & \scalebox{0.9}{0.579} & \scalebox{0.9}{0.537} \\
    &  \scalebox{0.9}{336} & \textbf{\scalebox{0.9}{0.435}} & \textbf{\scalebox{0.9}{0.444}} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.515} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.492} & \scalebox{0.9}{0.465} & - & - & - & - & \scalebox{0.9}{0.794} & \scalebox{0.9}{0.682} & \scalebox{0.9}{0.673} & \scalebox{0.9}{0.563} \\
    &  \scalebox{0.9}{720} & \textbf{\scalebox{0.9}{0.468}} & \textbf{\scalebox{0.9}{0.474}} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.496} & \scalebox{0.9}{0.488} & \scalebox{0.9}{0.468} & \scalebox{0.9}{0.468} & - & - & - & - & \scalebox{0.9}{0.739} & \scalebox{0.9}{0.617} & \scalebox{0.9}{0.729} & \scalebox{0.9}{0.620} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \textbf{\scalebox{0.9}{0.428}} & \textbf{\scalebox{0.9}{0.441}} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.464} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.453} & \scalebox{0.9}{0.450} & - & - & - & - & \scalebox{0.9}{0.598} & \scalebox{0.9}{0.548} & \scalebox{0.9}{0.616} & \scalebox{0.9}{0.550} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{Weather\\ $\downarrow$ \\ETTh1}$}} 
    &  \scalebox{0.9}{96} & \scalebox{0.9}{0.477} & \scalebox{0.9}{0.444} & \textbf{\scalebox{0.9}{0.380}} & \scalebox{0.9}{0.412} & \scalebox{0.9}{0.397} & \scalebox{0.9}{0.440} & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.429} & - & - & - & - & \scalebox{0.9}{0.421} & \scalebox{0.9}{0.410} & \scalebox{0.9}{0.393} & \textbf{\scalebox{0.9}{0.410}} \\
    &  \scalebox{0.9}{192} & \scalebox{0.9}{0.454} & \scalebox{0.9}{0.522} & \textbf{\scalebox{0.9}{0.416}} & \textbf{\scalebox{0.9}{0.434}} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.466} & \scalebox{0.9}{0.461} & \scalebox{0.9}{0.451} & - & - & - & - & \scalebox{0.9}{0.539} & \scalebox{0.9}{0.503} & \scalebox{0.9}{0.440} & \scalebox{0.9}{0.437} \\
    &  \scalebox{0.9}{336} & \textbf{\scalebox{0.9}{0.424}} & \textbf{\scalebox{0.9}{0.434}} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.479} & \scalebox{0.9}{0.458} & \scalebox{0.9}{0.463} & \scalebox{0.9}{0.456} & - & - & - & - & \scalebox{0.9}{0.568} & \scalebox{0.9}{0.514} & \scalebox{0.9}{0.450} & \scalebox{0.9}{0.451} \\
    &  \scalebox{0.9}{720} & \textbf{\scalebox{0.9}{0.468}} & \textbf{\scalebox{0.9}{0.469}} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.515} & \scalebox{0.9}{0.492} & \scalebox{0.9}{0.507} & \scalebox{0.9}{0.489} & - & - & - & - & \scalebox{0.9}{0.544} & \scalebox{0.9}{0.522} & \scalebox{0.9}{0.567} & \scalebox{0.9}{0.541} \\
    \cmidrule(lr){2-18}
    &  \scalebox{0.9}{Avg} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.467} & \textbf{\scalebox{0.9}{0.431}} & \textbf{\scalebox{0.9}{0.448}} & \scalebox{0.9}{0.462} & \scalebox{0.9}{0.464} & \scalebox{0.9}{0.465} & \scalebox{0.9}{0.456} & - & - & - & - & \scalebox{0.9}{0.518} & \scalebox{0.9}{0.487} & \scalebox{0.9}{0.463} & \scalebox{0.9}{0.460} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{ETTh1\\ $\downarrow$ \\ETTm1}$}}
    &  \scalebox{0.9}{96} & \textbf{\scalebox{0.9}{0.290}} & \scalebox{0.9}{0.348} & \scalebox{0.9}{0.295} & \textbf{\scalebox{0.9}{0.346}} & \scalebox{0.9}{0.311} & \scalebox{0.9}{0.355} & \scalebox{0.9}{0.315} & \scalebox{0.9}{0.354} & - & - & - & - & \scalebox{0.9}{0.308} & \scalebox{0.9}{0.355} & \scalebox{0.9}{0.681} & \scalebox{0.9}{0.545} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{0.327} & \textbf{\scalebox{0.9}{0.372}} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.337} & \textbf{\scalebox{0.9}{0.372}} & \scalebox{0.9}{0.365} & \scalebox{0.9}{0.391} & - & - & - & - & \scalebox{0.9}{0.357} & \scalebox{0.9}{0.390} & \scalebox{0.9}{0.689} & \scalebox{0.9}{0.551} \\
    & \scalebox{0.9}{336} & \textbf{\scalebox{0.9}{0.357}} & \textbf{\scalebox{0.9}{0.392}} & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.372} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.384} & \scalebox{0.9}{0.400} & - & - & - & - & \scalebox{0.9}{0.396} & \scalebox{0.9}{0.402} & \scalebox{0.9}{0.705} & \scalebox{0.9}{0.560} \\
    & \scalebox{0.9}{720} & \textbf{\scalebox{0.9}{0.409}} & \textbf{\scalebox{0.9}{0.423}} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.422} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.426} & - & - & - & - & \scalebox{0.9}{0.419} & \scalebox{0.9}{0.423} & \scalebox{0.9}{0.722} & \scalebox{0.9}{0.571} \\
    \cmidrule(lr){2-18}
    & \scalebox{0.9}{Avg} & \textbf{\scalebox{0.9}{0.346}} & \textbf{\scalebox{0.9}{0.384}} & \scalebox{0.9}{0.356} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.360} & \scalebox{0.9}{0.390} & \scalebox{0.9}{0.373} & \scalebox{0.9}{0.393} & - & - & - & - & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.393} & \scalebox{0.9}{0.699} & \scalebox{0.9}{0.557} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{ETTh2\\ $\downarrow$ \\ETTm1}$}} 
    & \scalebox{0.9}{96} & \scalebox{0.9}{0.322} & \scalebox{0.9}{0.347} & \textbf{\scalebox{0.9}{0.295}} & \textbf{\scalebox{0.9}{0.346}} & \scalebox{0.9}{0.323} & \scalebox{0.9}{0.362} & \scalebox{0.9}{0.338} & \scalebox{0.9}{0.383} & - & - & - & - & \scalebox{0.9}{0.322} & \scalebox{0.9}{0.351} & \scalebox{0.9}{0.679} & \scalebox{0.9}{0.546} \\
    & \scalebox{0.9}{192} & \scalebox{0.9}{0.332} & \textbf{\scalebox{0.9}{0.372}} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.395} & \scalebox{0.9}{0.394} & \scalebox{0.9}{0.408} & - & - & - & - & \textbf{\scalebox{0.9}{0.331}} & \scalebox{0.9}{0.373} & \scalebox{0.9}{0.673} & \scalebox{0.9}{0.551} \\
    & \scalebox{0.9}{336} & \scalebox{0.9}{0.394} & \textbf{\scalebox{0.9}{0.391}} & \textbf{\scalebox{0.9}{0.370}} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.397} & \scalebox{0.9}{0.413} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.412} & - & - & - & - & \scalebox{0.9}{0.382} & \scalebox{0.9}{0.397} & \scalebox{0.9}{0.703} & \scalebox{0.9}{0.557} \\
    & \scalebox{0.9}{720} & \textbf{\scalebox{0.9}{0.411}} & \textbf{\scalebox{0.9}{0.424}} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.442} & \scalebox{0.9}{0.439} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.432} & - & - & - & - & \scalebox{0.9}{0.417} & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.722} & \scalebox{0.9}{0.573} \\
    \cmidrule(lr){2-18} 
    & \scalebox{0.9}{Avg} & \scalebox{0.9}{0.365} & \textbf{\scalebox{0.9}{0.384}} & \textbf{\scalebox{0.9}{0.356}} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.383} & \scalebox{0.9}{0.402} & \scalebox{0.9}{0.391} & \scalebox{0.9}{0.409} & - & - & - & - & \scalebox{0.9}{0.363} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.694} & \scalebox{0.9}{0.557} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{ETTm2\\ $\downarrow$ \\ETTm1}$}} 
    &  \scalebox{0.9}{96} & \scalebox{0.9}{0.297} & \scalebox{0.9}{0.348} & \textbf{\scalebox{0.9}{0.295}} & \textbf{\scalebox{0.9}{0.346}} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.378} & \scalebox{0.9}{0.327} & \scalebox{0.9}{0.364} & - & - & - & - & \scalebox{0.9}{0.320} & \scalebox{0.9}{0.364} & \scalebox{0.9}{0.422} & \scalebox{0.9}{0.434} \\
    & \scalebox{0.9}{192} & \textbf{\scalebox{0.9}{0.332}} & \textbf{\scalebox{0.9}{0.370}} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.381} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.362} & \scalebox{0.9}{0.389} & - & - & - & - & \scalebox{0.9}{0.367} & \scalebox{0.9}{0.386} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.371} \\
    & \scalebox{0.9}{336} & \textbf{\scalebox{0.9}{0.364}} & \textbf{\scalebox{0.9}{0.393}} & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.394} & \scalebox{0.9}{0.413} & \scalebox{0.9}{0.401} & \scalebox{0.9}{0.418} & - & - & - & - & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.394} & \scalebox{0.9}{0.402} & \scalebox{0.9}{0.444} \\
    & \scalebox{0.9}{720} & \textbf{\scalebox{0.9}{0.410}} & \textbf{\scalebox{0.9}{0.421}} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.455} & \scalebox{0.9}{0.453} & \scalebox{0.9}{0.437} & \scalebox{0.9}{0.437} & - & - & - & - & \scalebox{0.9}{0.479} & \scalebox{0.9}{0.503} & \scalebox{0.9}{0.481} & \scalebox{0.9}{0.432} \\
    \cmidrule(lr){2-18}   
    &  \scalebox{0.9}{Avg} & \textbf{\scalebox{0.9}{0.351}} & \textbf{\scalebox{0.9}{0.383}} & \scalebox{0.9}{0.356} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.390} & \scalebox{0.9}{0.410} & \scalebox{0.9}{0.382} & \scalebox{0.9}{0.402} & - & - & - & - & \scalebox{0.9}{0.385} & \scalebox{0.9}{0.412} & \scalebox{0.9}{0.423} & \scalebox{0.9}{0.420} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.9}{$\shortstack{Weather\\ $\downarrow$ \\ETTm1}$}} 
    &  \scalebox{0.9}{96} & \textbf{\scalebox{0.9}{0.294}} & \scalebox{0.9}{0.354} & \scalebox{0.9}{0.295} & \textbf{\scalebox{0.9}{0.346}} & \scalebox{0.9}{0.338} & \scalebox{0.9}{0.380} & \scalebox{0.9}{0.324} & \scalebox{0.9}{0.366} & - & - & - & - & \scalebox{0.9}{0.324} & \scalebox{0.9}{0.360} & \scalebox{0.9}{0.329} & \scalebox{0.9}{0.359} \\
    &  \scalebox{0.9}{192} & \textbf{\scalebox{0.9}{0.318}} & \textbf{\scalebox{0.9}{0.355}} & \scalebox{0.9}{0.333} & \scalebox{0.9}{0.374} & \scalebox{0.9}{0.473} & \scalebox{0.9}{0.457} & \scalebox{0.9}{0.349} & \scalebox{0.9}{0.377} & - & - & - & - & \scalebox{0.9}{0.359} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.392} & \scalebox{0.9}{0.392} \\
    &  \scalebox{0.9}{336} & \textbf{\scalebox{0.9}{0.361}} & \textbf{\scalebox{0.9}{0.397}} & \scalebox{0.9}{0.370} & \scalebox{0.9}{0.398} & \scalebox{0.9}{0.402} & \scalebox{0.9}{0.415} & \scalebox{0.9}{0.378} & \scalebox{0.9}{0.398} & - & - & - & - & \scalebox{0.9}{0.395} & \scalebox{0.9}{0.399} & \scalebox{0.9}{0.372} & \scalebox{0.9}{0.400} \\
    &  \scalebox{0.9}{720} & \scalebox{0.9}{0.427} & \textbf{\scalebox{0.9}{0.426}} & \scalebox{0.9}{0.427} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.432} & \scalebox{0.9}{0.438} & \textbf{\scalebox{0.9}{0.422}} & \scalebox{0.9}{0.427} & - & - & - & - & \scalebox{0.9}{0.450} & \scalebox{0.9}{0.467} & \scalebox{0.9}{0.434} & \scalebox{0.9}{0.429} \\
    \cmidrule(lr){2-18} 
    &  \scalebox{0.9}{Avg} & \textbf{\scalebox{0.9}{0.350}} & \textbf{\scalebox{0.9}{0.383}} & \scalebox{0.9}{0.356} & \scalebox{0.9}{0.387} & \scalebox{0.9}{0.411} & \scalebox{0.9}{0.423} & \scalebox{0.9}{0.368} & \scalebox{0.9}{0.392} & - & - & - & - & \scalebox{0.9}{0.382} & \scalebox{0.9}{0.403} & \scalebox{0.9}{0.382} & \scalebox{0.9}{0.395} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \vspace{-2pt}
  %\vspace{-1.0em}
\end{table*}
```
```{=latex}
\newpage
```
```{=latex}
\begin{table*}[ht]\vspace{-9pt}
  \caption{Full ablation studies for the in-domain setting of forecasting. The standard deviations of SimMTM are within 0.005 for MSE and within 0.004 for MAE.}\label{tab:forecast_in_domain_abs_full}
  \vspace{-5pt}
  \vskip 0.15in
  \centering
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{9pt}
  \renewcommand\arraystretch{0.5}
  \begin{tabular}{c|c|cccccccc}
    \toprule
    \multicolumn{2}{c}{Input-336} & \multicolumn{2}{c}{Supervised} & \multicolumn{2}{c}{W/o ${\cal L}_{\text{reconstruction}}$} & \multicolumn{2}{c}{W/o ${\cal L}_{\text{constraint}}$} & \multicolumn{2}{c}{\textbf{SimMTM}}  \\
    \cmidrule(lr){3-4} \cmidrule(lr){5-6} \cmidrule(lr){7-8} \cmidrule(lr){9-10}
    \multicolumn{2}{c}{Metric} & MSE & MAE & MSE & MAE & MSE & MAE & MSE & MAE \\
    \toprule
    \multirow{5}{*}{ETTh1}
    &  96 & 0.380 & 0.412 & \textbf{0.377} & 0.408 & 0.381 & 0.409 & 0.379 & \textbf{0.407} \\
    & 192 & 0.416 & 0.434 & 0.419 & 0.443 & \textbf{0.409} & 0.443 & 0.412 & \textbf{0.424} \\
    & 336 & 0.448 & 0.458 & 0.423 & 0.434 & 0.432 & 0.444 & \textbf{0.421} & \textbf{0.431} \\
    & 720 & 0.481 & 0.487 & 0.437 & 0.454 & 0.447 & 0.454 & \textbf{0.424} & \textbf{0.449} \\
    \cmidrule(lr){2-10}
    & Avg & 0.431 & 0.448 & 0.414 & 0.435 & 0.417 & 0.438 & \textbf{0.409} & \textbf{0.428} \\
    \midrule
    \multirow{5}{*}{ETTh2}  
    & 96 & 0.325 & 0.374 & \textbf{0.288} & \textbf{0.344} & 0.312 & 0.365 & 0.293 & 0.347 \\
    & 192 & 0.400 & 0.424 & 0.356 & 0.391 & 0.389 & 0.418 & \textbf{0.355} & \textbf{0.386} \\
    & 336 & 0.405 & 0.433 & \textbf{0.368} & 0.406 & 0.396 & 0.432 & 0.370 & \textbf{0.401} \\
    & 720 & 0.451 & 0.475 & 0.409 & 0.432 & 0.448 & 0.479 & \textbf{0.395} &  \textbf{0.427}\\
    \cmidrule(lr){2-10} 
    & Avg & 0.395 & 0.427 & 0.355 & 0.393 & 0.386 & 0.424 & \textbf{0.353} & \textbf{0.390} \\
    \midrule
    \multirow{5}{*}{ETTm1} 
    &  96 & 0.295 & 0.346 & 0.291 & 0.343 & \textbf{0.282} & \textbf{0.337} & 0.288 & 0.348 \\
    & 192 & 0.333 & 0.374 & 0.330 & 0.390 & \textbf{0.324} & 0.388 & 0.327 & \textbf{0.373} \\
    & 336 & 0.370 & 0.398 & 0.369 & 0.399 & 0.366 & 0.397 & \textbf{0.363} & \textbf{0.395} \\
    & 720 & 0.427 & 0.431 & 0.417 & 0.429 & 0.424 & 0.435 & \textbf{0.412} & \textbf{0.424} \\
    \cmidrule(lr){2-10} 
    & Avg & 0.356 & 0.387 & 0.352 & 0.390 & 0.349 & 0.389 & \textbf{0.348} & \textbf{0.385} \\
    \midrule
    \multirow{5}{*}{ETTm2} 
    &  96 & 0.175 & 0.268 & 0.174 & 0.265 & \textbf{0.170} & \textbf{0.261} & 0.172 & \textbf{0.261} \\
    & 192 & 0.240 & 0.312 & 0.232 & 0.303 & 0.244 & 0.320 & \textbf{0.223} & \textbf{0.300} \\
    & 336 & 0.298 & 0.351 & 0.313 & 0.365 & \textbf{0.279} & 0.334 & 0.282 & \textbf{0.331} \\
    & 720 & 0.403 & 0.413 & 0.376 & 0.451 & 0.376 & \textbf{0.378} & \textbf{0.374} & 0.388 \\
    \cmidrule(lr){2-10}
    & Avg & 0.279 & 0.336 & 0.274 & 0.346 & 0.267 & 0.323 & \textbf{0.263} & \textbf{0.320} \\
    \midrule
    \multirow{5}{*}{Weather} 
    &  96 & 0.166 & 0.216 & 0.164 & \textbf{0.209} & 0.160 & 0.212 & \textbf{0.158} & 0.211 \\
    & 192 & 0.208 & 0.254 & 0.203 & 0.258 & 0.203 & 0.251 & \textbf{0.199} & \textbf{0.249} \\
    & 336 & 0.257 & 0.290 & 0.244 & 0.289 & 0.253 & 0.290 & \textbf{0.246} & \textbf{0.286} \\
    & 720 & 0.326 & 0.338 & 0.322 & 0.343 & 0.325 & 0.340 & \textbf{0.317} & \textbf{0.337} \\
    \cmidrule(lr){2-10}  
    & Avg & 0.239 & 0.275 & 0.233 & 0.275 & 0.235 & 0.273 & \textbf{0.230} & \textbf{0.271} \\
    \midrule
    \multirow{5}{*}{Electricity} 
    &  96 & 0.190 & 0.279 & 0.177 & 0.270 & 0.134 & \textbf{0.220} & \textbf{0.133} & 0.223 \\
    & 192 & 0.195 & 0.285 & 0.184 & 0.279 & 0.163 & 0.274 & \textbf{0.147} & \textbf{0.237} \\
    & 336 & 0.211 & 0.301 & 0.202 & 0.300 & 0.223 & 0.311 & \textbf{0.166} & \textbf{0.265} \\
    & 720 & 0.253 & 0.333 & 0.250 & 0.337 & 0.241 & 0.321 & \textbf{0.203} & \textbf{0.297} \\
    \cmidrule(lr){2-10}  
    & Avg & 0.212 & 0.300 & 0.203 & 0.297 & 0.190 & 0.282 & \textbf{0.162} & \textbf{0.256} \\
    \midrule
    \multirow{5}{*}{Traffic} 
    &  96 & 0.471 & 0.309 & \textbf{0.366} & \textbf{0.257} & 0.457 & 0.301 & 0.368 & 0.262 \\
    & 192 & 0.475 & 0.308 & \textbf{0.373} & 0.266 & 0.468 & 0.325 & \textbf{0.373} & \textbf{0.251} \\
    & 336 & 0.490 & 0.315 & 0.401 & 0.249 & 0.487 & 0.302 & \textbf{0.395} & \textbf{0.254} \\
    & 720 & 0.524 & 0.332 & 0.472 & 0.312 & 0.485 & 0.315 & \textbf{0.432} & \textbf{0.290} \\
    \cmidrule(lr){2-10}  
    & Avg & 0.490 & 0.316 & 0.403 & 0.271 & 0.474 & 0.311 & \textbf{0.392} & \textbf{0.264} \\
    \bottomrule
  \end{tabular}
  \end{small}
\end{table*}
```
```{=latex}
\newpage
```
```{=latex}
\begin{table*}[h]\caption{Full ablation studies on transfer to ETTh1 and ETTm1 for the cross-domain setting of forecasting. The standard deviations of SimMTM are within 0.005 for MSE and within 0.004 for MAE.}\label{tab:forecast_cross_domain_abs_full}
  \vspace{-5pt}
  \vskip 0.15in
  \centering
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{9pt}
  \renewcommand\arraystretch{0.9}
  \begin{tabular}{c|c|cccccccc}
    \toprule
    \multicolumn{2}{c}{Input-336} & \multicolumn{2}{c}{Supervised} & \multicolumn{2}{c}{W/o ${\cal L}_{\text{reconstruction}}$} & \multicolumn{2}{c}{W/o ${\cal L}_{\text{constraint}}$} & \multicolumn{2}{c}{{\textbf{SimMTM}}}  \\
    \cmidrule(lr){3-4} \cmidrule(lr){5-6}\cmidrule(lr){7-8} \cmidrule(lr){9-10}
    \multicolumn{2}{c}{Metric} & MSE & MAE & MSE & MAE & MSE  & MAE & MSE & MAE \\
    \toprule
    \multirow{5}{*}{\shortstack{ETTh2\\ $\downarrow$ \\ETTh1}}
    &  96 & 0.380 & 0.412 & 0.377 & \textbf{0.400} & 0.402 & 0.411 & \textbf{0.372} & 0.401 \\
    & 192 & 0.416 & 0.434 & 0.417 & 0.424 & 0.417 & \textbf{0.420} & \textbf{0.414} & 0.425 \\
    & 336 & 0.448 & 0.458 & 0.437 & 0.439 & 0.437 & \textbf{0.435} & \textbf{0.429} & 0.436 \\
    & 720 & 0.481 & 0.487 & 0.448 & 0.463 & 0.456 & 0.467 & \textbf{0.446} & \textbf{0.458} \\
    \cmidrule(lr){2-10} 
    & Avg & 0.431 & 0.448 & 0.420 & 0.432 & 0.423 & \textbf{0.430} & \textbf{0.415} & \textbf{0.430} \\
    \midrule
    \multirow{5}{*}{\shortstack{ETTm1\\ $\downarrow$ \\ETTh1}}  
    &  96 & 0.380 & 0.412 & 0.382 & \textbf{0.397} & 0.375 & 0.399 & \textbf{0.367} & 0.398 \\
    & 192 & 0.416 & 0.434 & 0.418 & \textbf{0.418} & 0.413 & 0.422 & \textbf{0.396} & 0.421 \\
    & 336 & 0.448 & 0.458 & 0.437 & \textbf{0.434} & \textbf{0.434} & 0.438 & 0.471 & 0.437 \\
    & 720 & 0.481 & 0.487 & 0.459 & 0.469 & 0.467 & 0.475 & \textbf{0.454} & \textbf{0.463} \\
    \cmidrule(lr){2-10} 
    & Avg & 0.431 & 0.448 & 0.424 & 0.430 & 0.422 & 0.434 & \textbf{0.422} & \textbf{0.430} \\
    \midrule
    \multirow{5}{*}{\shortstack{ETTm2\\ $\downarrow$ \\ETTh1}} 
    &  96 & \textbf{0.380} & \textbf{0.412} & 0.388 & 0.418 & 0.384 & 0.415 & 0.388 & 0.421 \\
    & 192 & \textbf{0.416} & 0.434 & 0.429 & 0.444 & 0.423 & 0.439 & 0.419 & \textbf{0.423} \\
    & 336 & 0.448 & 0.458 & 0.467 & 0.472 & 0.458 & 0.465 & \textbf{0.435} & \textbf{0.444} \\
    & 720 & 0.481 & 0.487 & 0.521 & 0.507 & 0.501 & 0.497 & \textbf{0.468} & \textbf{0.474} \\
    \cmidrule(lr){2-10} 
    & Avg & 0.431 & 0.448 & 0.451 & 0.460 & 0.441 & 0.454 & \textbf{0.428} & \textbf{0.441} \\
    \midrule
    \multirow{5}{*}{\shortstack{Weather\\ $\downarrow$ \\ETTh1}} 
    &  96 & \textbf{0.380} & 0.412 & 0.385 & \textbf{0.400} & 0.394 & 0.406 & 0.477 & 0.444 \\
    & 192 & \textbf{0.416} & 0.434 & 0.417 & 0.429 & 0.425 & \textbf{0.424} & 0.454 & 0.522 \\
    & 336 & 0.448 & 0.458 & 0.434 & \textbf{0.434} & 0.441 & 0.439 & \textbf{0.424} & \textbf{0.434} \\
    & 720 & 0.481 & 0.487 & \textbf{0.444} & \textbf{0.464} & 0.446 & 0.468 & 0.468 & 0.469 \\
    \cmidrule(lr){2-10} 
    & Avg & 0.431 & 0.448 & \textbf{0.420} & \textbf{0.432} & 0.427 & 0.434 & 0.456 & 0.467 \\
    \midrule
    \multirow{5}{*}{\shortstack{ETTh1\\ $\downarrow$ \\ETTm1}}
    &  96 & 0.295 & 0.346 & \textbf{0.286} & \textbf{0.341} & 0.290 & 0.346 & 0.290 & 0.348\\
    & 192 & 0.333 & 0.374 & \textbf{0.322} & \textbf{0.362} & 0.353 & 0.388 & 0.327 & 0.372 \\
    & 336 & 0.370 & 0.398 & 0.362 & 0.418 & 0.362 & 0.412 & \textbf{0.357} & \textbf{0.392} \\
    & 720 & 0.427 & 0.431 & 0.417 & 0.431 & 0.422 & 0.432 & \textbf{0.409} & \textbf{0.423} \\
    \cmidrule(lr){2-10} 
    & Avg & 0.356 & 0.387 & 0.347 & 0.388 & 0.357 & 0.395 & \textbf{0.346} & \textbf{0.384} \\
    \midrule
    \multirow{5}{*}{\shortstack{ETTh2\\ $\downarrow$ \\ETTm1}}  
    &  96 & \textbf{0.295} & \textbf{0.346} & 0.299 & 0.348 & 0.301 & 0.352 & 0.322 & 0.347 \\
    & 192 & 0.333 & 0.374 & \textbf{0.324} & 0.366 & 0.332 & \textbf{0.359} & 0.332 & 0.372 \\
    & 336 & \textbf{0.370} & 0.398 & 0.374 & 0.401 & 0.389 & \textbf{0.382} & 0.394 & 0.391 \\
    & 720 & 0.427 & 0.431 & 0.415 & \textbf{0.419} & 0.421 & 0.442 & \textbf{0.411} & 0.424 \\
    \cmidrule(lr){2-10} 
    & Avg & 0.356 & 0.387 & \textbf{0.353} & 0.386 & 0.361 & \textbf{0.384} & 0.365 & \textbf{0.384} \\
    \midrule
    \multirow{5}{*}{\shortstack{ETTm2\\ $\downarrow$ \\ETTm1}} 
    &  96 & 0.295 & 0.346 & 0.299 & 0.351 & \textbf{0.285} & \textbf{0.336} & 0.297 & 0.348 \\
    & 192 & 0.333 & 0.374 & 0.334 & 0.372 & 0.343 & \textbf{0.366} & \textbf{0.332} & 0.370 \\
    & 336 & 0.370 & 0.398 & 0.362 & \textbf{0.388} & \textbf{0.360} & 0.399 & 0.364 & 0.393 \\
    & 720 & 0.427 & 0.431 & 0.417 & 0.431 & 0.422 & 0.432 & \textbf{0.410} & \textbf{0.421} \\
    \cmidrule(lr){2-10} 
    & Avg & 0.356 & 0.387 & 0.353 & 0.386 & 0.353 & \textbf{0.383} & \textbf{0.351} & \textbf{0.383} \\
    \midrule
    \multirow{5}{*}{\shortstack{Weather\\ $\downarrow$ \\ETTm1}} 
    &  96 & 0.295 & \textbf{0.346} & 0.322 & 0.361 & 0.309 & 0.354 & \textbf{0.294} & 0.354 \\
    & 192 & 0.333 & 0.374 & 0.344 & 0.378 & 0.343 & 0.365 & \textbf{0.318} & \textbf{0.355} \\
    & 336 & 0.370 & 0.398 & 0.371 & 0.399 & 0.401 & 0.411 & \textbf{0.361} & \textbf{0.397} \\
    & 720 & 0.427 & 0.431 & 0.426 & \textbf{0.422} & \textbf{0.425} & 0.427 & 0.427 & 0.426 \\
    \cmidrule(lr){2-10} 
    & Avg & 0.356 & 0.387 & 0.366 & 0.390 & 0.370 & 0.389 & \textbf{0.350} & \textbf{0.383} \\
    \bottomrule
  \end{tabular}
  \end{small}
\end{table*}
```
```{=latex}
\newpage
```
```{=latex}
\centering
```
```{=latex}
\begin{threeparttable}
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{0.3pt}
  \begin{tabular}{cc|cccccccccccccccccc}
    \toprule
    \multicolumn{2}{c}{\multirow{2}{*}{\scalebox{0.8}{Models}}} & \multicolumn{4}{c}{\scalebox{0.8}{Transformer \cite{vaswani2017attention-Transformer}}} & \multicolumn{4}{c}{\scalebox{0.8}{Autoformer \cite{wu2021-autoformer}}}  & \multicolumn{4}{c}{\scalebox{0.8}{Ns Transformer \cite{liunon-NS}}} & \multicolumn{6}{c}{\scalebox{0.8}{PatchTST \cite{nie2022time-patch}}} \\ 
    \cmidrule(lr){3-6} \cmidrule(lr){7-10} \cmidrule(lr){11-14} \cmidrule(lr){15-20} 
    & & \multicolumn{2}{c}{\scalebox{0.8}{Random init.}} & \multicolumn{2}{c}{\textbf{\scalebox{0.8}{+SimMTM}}} &  \multicolumn{2}{c}{\scalebox{0.8}{Random init.}} & \multicolumn{2}{c}{\textbf{\scalebox{0.8}{+SimMTM}}}  & \multicolumn{2}{c}{\scalebox{0.8}{Random init.}} & \multicolumn{2}{c}{\textbf{\scalebox{0.8}{+SimMTM}}}  & \multicolumn{2}{c}{\scalebox{0.8}{Random init.}} & \multicolumn{2}{c}{\scalebox{0.8}{+Sub-series Masking}} & \multicolumn{2}{c}{\textbf{\scalebox{0.8}{+SimMTM}}} \\
    \cmidrule(lr){3-20}
    \multicolumn{2}{c}{\scalebox{0.8}{Metric}} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} & \scalebox{0.8}{MSE} & \scalebox{0.8}{MAE} \\
    \toprule
    \multirow{5}{*}{\scalebox{0.9}{\rotatebox{90}{ETTh1}}}
    &  \scalebox{0.8}{96} & \scalebox{0.8}{0.847} & \scalebox{0.8}{0.731} & \scalebox{0.8}{0.775} & \scalebox{0.8}{0.691} & \scalebox{0.8}{0.536} & \scalebox{0.8}{0.548} & \scalebox{0.8}{0.526} & \scalebox{0.8}{0.536} & \scalebox{0.8}{0.513} & \scalebox{0.8}{0.491} & \scalebox{0.8}{0.490} & \scalebox{0.8}{0.489} & \scalebox{0.8}{0.375} & \scalebox{0.8}{0.399} & \scalebox{0.8}{0.366} & \scalebox{0.8}{0.398} & \scalebox{0.8}{0.373} & \scalebox{0.8}{0.399} \\
    & \scalebox{0.8}{192} & \scalebox{0.8}{1.084} & \scalebox{0.8}{0.841} & \scalebox{0.8}{0.918} & \scalebox{0.8}{0.763} & \scalebox{0.8}{0.543} & \scalebox{0.8}{0.551} & \scalebox{0.8}{0.523} & \scalebox{0.8}{0.548} & \scalebox{0.8}{0.534} & \scalebox{0.8}{0.504} & \scalebox{0.8}{0.517} & \scalebox{0.8}{0.499} & \scalebox{0.8}{0.414} & \scalebox{0.8}{0.421} & \textcolor{gray}{\scalebox{0.8}{0.431}} & \textcolor{gray}{\scalebox{0.8}{0.443}} & \scalebox{0.8}{0.406} & \scalebox{0.8}{0.428} \\
    & \scalebox{0.8}{336} & \scalebox{0.8}{1.350} & \scalebox{0.8}{0.956} & \scalebox{0.8}{1.079} & \scalebox{0.8}{0.845} & \scalebox{0.8}{0.615} & \scalebox{0.8}{0.592} & \scalebox{0.8}{0.595} & \scalebox{0.8}{0.591} & \scalebox{0.8}{0.588} & \scalebox{0.8}{0.535} & \scalebox{0.8}{0.552} & \scalebox{0.8}{0.520} & \scalebox{0.8}{0.431} & \scalebox{0.8}{0.436} & \textcolor{gray}{\scalebox{0.8}{0.450$\downarrow$}} & \textcolor{gray}{\scalebox{0.8}{0.456$\downarrow$}} & \scalebox{0.8}{0.422} & \scalebox{0.8}{0.431} \\
    & \scalebox{0.8}{720} & \scalebox{0.8}{1.069} & \scalebox{0.8}{0.817} & \scalebox{0.8}{0.935} & \scalebox{0.8}{0.761} & \scalebox{0.8}{0.599} & \scalebox{0.8}{0.600} & \scalebox{0.8}{0.600} & \scalebox{0.8}{0.597} & \scalebox{0.8}{0.643} & \scalebox{0.8}{0.616} & \scalebox{0.8}{0.614} & \scalebox{0.8}{0.598} & \scalebox{0.8}{0.449} & \scalebox{0.8}{0.466} & \textcolor{gray}{\scalebox{0.8}{0.472$\downarrow$}} & \textcolor{gray}{\scalebox{0.8}{0.484$\downarrow$}} & \scalebox{0.8}{0.436} & \scalebox{0.8}{0.452} \\
    \cmidrule(lr){2-20}
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{1.088} & \scalebox{0.8}{0.836} & \scalebox{0.8}{\textbf{0.927}} & \scalebox{0.8}{\textbf{0.761}} & \scalebox{0.8}{0.573} & \scalebox{0.8}{0.573} & \scalebox{0.8}{\textbf{0.561}} & \scalebox{0.8}{\textbf{0.568}} & \scalebox{0.8}{0.570} & \scalebox{0.8}{0.537} & \scalebox{0.8}{\textbf{0.543}} & \scalebox{0.8}{\textbf{0.527}} & \scalebox{0.8}{0.417} & \scalebox{0.8}{0.431} & \textcolor{gray}{\scalebox{0.8}{0.430$\downarrow$}} & \textcolor{gray}{\scalebox{0.8}{0.445$\downarrow$}} & \scalebox{0.8}{\textbf{0.409}} & \scalebox{0.8}{\textbf{0.428}} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{\rotatebox{90}{ETTh2}}}  
    &  \scalebox{0.8}{96} & \scalebox{0.8}{ 2.029 } & \scalebox{0.8}{ 1.150 } & \scalebox{0.8}{ 1.879 } & \scalebox{0.8}{ 1.104 } & \scalebox{0.8}{ 0.492 } & \scalebox{0.8}{ 0.517 } & \scalebox{0.8}{ 0.488 } & \scalebox{0.8}{ 0.514 } & \scalebox{0.8}{ 0.476 } & \scalebox{0.8}{ 0.458 } & \scalebox{0.8}{ 0.445 } & \scalebox{0.8}{ 0.448 } & \scalebox{0.8}{ 0.274 } & \scalebox{0.8}{ 0.336 } & \scalebox{0.8}{ \textcolor{gray}{0.284$\downarrow$} } & \scalebox{0.8}{ \textcolor{gray}{0.343$\downarrow$} } & \scalebox{0.8}{ 0.274 } & \scalebox{0.8}{ 0.337} \\
    & \scalebox{0.8}{192} & \scalebox{0.8}{ 6.785 } & \scalebox{0.8}{ 2.099 } & \scalebox{0.8}{ 5.054 } & \scalebox{0.8}{ 1.771 } & \scalebox{0.8}{ 0.556 } & \scalebox{0.8}{ 0.551 } & \scalebox{0.8}{ 0.547 } & \scalebox{0.8}{ 0.549 } & \scalebox{0.8}{ 0.512 } & \scalebox{0.8}{ 0.493 } & \scalebox{0.8}{ 0.482 } & \scalebox{0.8}{ 0.502 } & \scalebox{0.8}{ 0.339 } & \scalebox{0.8}{ 0.379 } & \scalebox{0.8}{ \textcolor{gray}{0.355$\downarrow$} } & \scalebox{0.8}{ \textcolor{gray}{0.387$\downarrow$} } & \scalebox{0.8}{ 0.339 } & \scalebox{0.8}{ 0.377} \\
    & \scalebox{0.8}{336} & \scalebox{0.8}{ 4.568 } & \scalebox{0.8}{ 1.711 } & \scalebox{0.8}{ 4.242 } & \scalebox{0.8}{ 1.658 } & \scalebox{0.8}{ 0.572 } & \scalebox{0.8}{ 0.578 } & \scalebox{0.8}{ 0.563 } & \scalebox{0.8}{ 0.570 } & \scalebox{0.8}{ 0.552 } & \scalebox{0.8}{ 0.551 } & \scalebox{0.8}{ 0.512 } & \scalebox{0.8}{ 0.537 } & \scalebox{0.8}{ 0.331 } & \scalebox{0.8}{ 0.380 } & \scalebox{0.8}{ \textcolor{gray}{0.379$\downarrow$} } & \scalebox{0.8}{ \textcolor{gray}{0.411$\downarrow$} } & \scalebox{0.8}{ 0.327 } & \scalebox{0.8}{ 0.381} \\
    & \scalebox{0.8}{720} & \scalebox{0.8}{ 3.030 } & \scalebox{0.8}{ 1.486 } & \scalebox{0.8}{ 2.815 } & \scalebox{0.8}{ 1.413 } & \scalebox{0.8}{ 0.580 } & \scalebox{0.8}{ 0.588 } & \scalebox{0.8}{ 0.575 } & \scalebox{0.8}{ 0.588 } & \scalebox{0.8}{ 0.562 } & \scalebox{0.8}{ 0.560 } & \scalebox{0.8}{ 0.531 } & \scalebox{0.8}{ 0.568 } & \scalebox{0.8}{ 0.379 } & \scalebox{0.8}{ 0.422 } & \scalebox{0.8}{ \textcolor{gray}{0.400$\downarrow$} } & \scalebox{0.8}{ \textcolor{gray}{0.435$\downarrow$} } & \scalebox{0.8}{ 0.375 } & \scalebox{0.8}{ 0.423} \\
    \cmidrule(lr){2-20}
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{  4.103 } & \scalebox{0.8}{  1.612 } & \scalebox{0.8}{  \textbf{3.498} } & \scalebox{0.8}{  \textbf{1.487} } & \scalebox{0.8}{  0.550 } & \scalebox{0.8}{  0.559 } & \scalebox{0.8}{  \textbf{0.543} } & \scalebox{0.8}{  \textbf{0.555} } & \scalebox{0.8}{  0.526 } & \scalebox{0.8}{ 0.516 } & \scalebox{0.8}{  \textbf{0.493} } & \scalebox{0.8}{  \textbf{0.514} } & \scalebox{0.8}{  0.331 } & \scalebox{0.8}{  0.379 } & \scalebox{0.8}{  \textcolor{gray}{0.355$\downarrow$}} & \scalebox{0.8}{  \textcolor{gray}{0.394$\downarrow$}} & \scalebox{0.8}{  \textbf{0.329} } & \scalebox{0.8}{ 0.379} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{\rotatebox{90}{ETTm1}}} 
    &  \scalebox{0.8}{96} & \scalebox{0.8}{  0.562 } & \scalebox{0.8}{  0.520 } & \scalebox{0.8}{  0.513 } & \scalebox{0.8}{  0.497 } & \scalebox{0.8}{  0.523 } & \scalebox{0.8}{  0.488 } & \scalebox{0.8}{  0.482 } & \scalebox{0.8}{  0.465 } & \scalebox{0.8}{  0.386 } & \scalebox{0.8}{  0.398 } & \scalebox{0.8}{  0.340 } & \scalebox{0.8}{  0.376 } & \scalebox{0.8}{  0.290 } & \scalebox{0.8}{  0.342 } & \scalebox{0.8}{  \textcolor{gray}{0.289$\downarrow$} } & \scalebox{0.8}{  \textcolor{gray}{0.344$\downarrow$} } & \scalebox{0.8}{  0.288 } & \scalebox{0.8}{  0.343} \\
    & \scalebox{0.8}{192} & \scalebox{0.8}{  0.810 } & \scalebox{0.8}{  0.668 } & \scalebox{0.8}{  0.686 } & \scalebox{0.8}{  0.606 } & \scalebox{0.8}{  0.543 } & \scalebox{0.8}{  0.498 } & \scalebox{0.8}{  0.499 } & \scalebox{0.8}{  0.476 } & \scalebox{0.8}{  0.459 } & \scalebox{0.8}{  0.444 } & \scalebox{0.8}{  0.423 } & \scalebox{0.8}{  0.445 } & \scalebox{0.8}{  0.332 } & \scalebox{0.8}{  0.369 } & \scalebox{0.8}{  0.323 } & \scalebox{0.8}{  0.368 } & \scalebox{0.8}{  0.329 } & \scalebox{0.8}{  0.367} \\
    & \scalebox{0.8}{336} & \scalebox{0.8}{  1.096 } & \scalebox{0.8}{  0.814 } & \scalebox{0.8}{  1.003 } & \scalebox{0.8}{  0.760 } & \scalebox{0.8}{  0.675 } & \scalebox{0.8}{  0.551 } & \scalebox{0.8}{  0.601 } & \scalebox{0.8}{  0.524 } & \scalebox{0.8}{  0.495 } & \scalebox{0.8}{  0.464 } & \scalebox{0.8}{  0.423 } & \scalebox{0.8}{  0.459 } & \scalebox{0.8}{  0.366 } & \scalebox{0.8}{  0.392 } & \scalebox{0.8}{  0.353 } & \scalebox{0.8}{  0.387 } & \scalebox{0.8}{  0.361 } & \scalebox{0.8}{  0.387} \\
    & \scalebox{0.8}{720} & \scalebox{0.8}{  1.136 } & \scalebox{0.8}{  0.813 } & \scalebox{0.8}{  1.032 } & \scalebox{0.8}{  0.790 } & \scalebox{0.8}{  0.720 } & \scalebox{0.8}{  0.528 } & \scalebox{0.8}{  0.629 } & \scalebox{0.8}{  0.555 } & \scalebox{0.8}{  0.585 } & \scalebox{0.8}{  0.516 } & \scalebox{0.8}{  0.539 } & \scalebox{0.8}{  0.499 } & \scalebox{0.8}{  0.420 } & \scalebox{0.8}{  0.424 } & \scalebox{0.8}{  0.398 } & \scalebox{0.8}{  0.416 } & \scalebox{0.8}{  0.413 } & \scalebox{0.8}{  0.417} \\
    \cmidrule(lr){2-20}
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{  0.901 } & \scalebox{0.8}{  0.704 } & \scalebox{0.8}{   \textbf{0.809} } & \scalebox{0.8}{      \textbf{0.663} } & \scalebox{0.8}{  0.615 } & \scalebox{0.8}{  0.528 } & \scalebox{0.8}{    \textbf{0.553} } & \scalebox{0.8}{      \textbf{0.505} } & \scalebox{0.8}{  0.481 } & \scalebox{0.8}{  0.456 } & \scalebox{0.8}{    \textbf{0.431} } & \scalebox{0.8}{      \textbf{0.445} } & \scalebox{0.8}{  0.352 } & \scalebox{0.8}{  0.382 } & \scalebox{0.8}{    \textbf{0.341} } & \scalebox{0.8}{  0.379 } & \scalebox{0.8}{  0.348 } & \scalebox{0.8}{    \textbf{0.378}} \\
    \midrule
    \multirow{5}{*}{\scalebox{0.8}{\rotatebox{90}{ETTm2}}}
    &  \scalebox{0.8}{96} & \scalebox{0.8}{  0.508 } & \scalebox{0.8}{  0.539 } & \scalebox{0.8}{  0.336 } & \scalebox{0.8}{  0.425 } & \scalebox{0.8}{  0.255 } & \scalebox{0.8}{  0.339 } & \scalebox{0.8}{  0.255 } & \scalebox{0.8}{  0.340 } & \scalebox{0.8}{  0.192 } & \scalebox{0.8}{  0.274 } & \scalebox{0.8}{  0.188 } & \scalebox{0.8}{  0.277 } & \scalebox{0.8}{  0.165 } & \scalebox{0.8}{  0.255 } & \scalebox{0.8}{  \textcolor{gray}{0.166$\downarrow$} } & \scalebox{0.8}{  \textcolor{gray}{0.256$\downarrow$} } & \scalebox{0.8}{  0.163 } & \scalebox{0.8}{  0.253} \\
    & \scalebox{0.8}{192} & \scalebox{0.8}{  0.972 } & \scalebox{0.8}{  0.721 } & \scalebox{0.8}{  0.713 } & \scalebox{0.8}{  0.610 } & \scalebox{0.8}{  0.281 } & \scalebox{0.8}{  0.340 } & \scalebox{0.8}{  0.276 } & \scalebox{0.8}{  0.332 } & \scalebox{0.8}{  0.280 } & \scalebox{0.8}{  0.339 } & \scalebox{0.8}{  0.277 } & \scalebox{0.8}{  0.336 } & \scalebox{0.8}{  0.220 } & \scalebox{0.8}{  0.292 } & \scalebox{0.8}{  \textcolor{gray}{0.221$\downarrow$} } & \scalebox{0.8}{  \textcolor{gray}{0.295$\downarrow$} } & \scalebox{0.8}{  0.219 } & \scalebox{0.8}{  0.292} \\
    & \scalebox{0.8}{336} & \scalebox{0.8}{  1.419 } & \scalebox{0.8}{  0.897 } & \scalebox{0.8}{  1.517 } & \scalebox{0.8}{  0.942 } & \scalebox{0.8}{  0.339 } & \scalebox{0.8}{  0.372 } & \scalebox{0.8}{  0.309 } & \scalebox{0.8}{  0.359 } & \scalebox{0.8}{  0.334 } & \scalebox{0.8}{  0.361 } & \scalebox{0.8}{  0.325 } & \scalebox{0.8}{  0.355 } & \scalebox{0.8}{  0.278 } & \scalebox{0.8}{  0.329 } & \scalebox{0.8}{  0.278 } & \scalebox{0.8}{  \textcolor{gray}{0.333$\downarrow$} } & \scalebox{0.8}{  0.275 } & \scalebox{0.8}{  0.328} \\
    & \scalebox{0.8}{720} & \scalebox{0.8}{  3.598 } & \scalebox{0.8}{  1.445 } & \scalebox{0.8}{  2.720 } & \scalebox{0.8}{  1.254 } & \scalebox{0.8}{  0.422 } & \scalebox{0.8}{  0.419 } & \scalebox{0.8}{  0.420 } & \scalebox{0.8}{  0.410 } & \scalebox{0.8}{  0.417 } & \scalebox{0.8}{  0.413 } & \scalebox{0.8}{  0.414 } & \scalebox{0.8}{  0.412 } & \scalebox{0.8}{  0.367 } & \scalebox{0.8}{  0.385 } & \scalebox{0.8}{  0.365 } & \scalebox{0.8}{ \textcolor{gray}{0.388$\downarrow$} } & \scalebox{0.8}{  0.359 } & \scalebox{0.8}{  0.381} \\
    \cmidrule(lr){2-20}
    & \scalebox{0.8}{Avg} & \scalebox{0.8}{  1.624 } & \scalebox{0.8}{  0.901 } & \scalebox{0.8}{   \textbf{1.322} } & \scalebox{0.8}{      \textbf{0.808} } & \scalebox{0.8}{  0.324 } & \scalebox{0.8}{  0.368 } & \scalebox{0.8}{    \textbf{0.315} } & \scalebox{0.8}{      \textbf{0.360} } & \scalebox{0.8}{  0.306 } & \scalebox{0.8}{  0.347 } & \scalebox{0.8}{    \textbf{0.301} } & \scalebox{0.8}{      \textbf{0.345} } & \scalebox{0.8}{  0.258 } & \scalebox{0.8}{  0.317 } & \scalebox{0.8}{  0.258 } & \scalebox{0.8}{  \textcolor{gray}{0.318$\downarrow$} } & \scalebox{0.8}{    \textbf{0.254} } & \scalebox{0.8}{      \textbf{0.313}} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \end{threeparttable}
```
```{=latex}
\vspace{45pt}
```
```{=latex}
\begin{table*}[h]\caption{Full results for fine-tuning under limited-data scenarios. We fine-tune the model pre-trained on ETTh2 to ETTh1 with different data proportions $\{10\%, 25\%, 50\%, 75\%, 100\%\}$.}
  \label{tab:full_data_limited}
  \vspace{5pt}
  \centering
  \begin{small}
  \renewcommand{\multirowsetup}{\centering}
  \setlength{\tabcolsep}{1.5pt}
  \renewcommand\arraystretch{1.2}
  \begin{tabular}{cccc|cc|cccccccccccc}
    \toprule
    \multicolumn{2}{c}{\scalebox{0.9}{Models}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{\textbf{SimMTM}}}} & \multicolumn{2}{c}{\scalebox{0.9}{Random init.}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{Ti-MAE}\cite{li2023ti-TiMAE}}} &
    \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TST}\cite{zerveas2021transformer-TST}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{LaST}\cite{wang2022learning-LaST}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TF-C}\cite{Zhang2022-TF-C}}} & \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{CoST}\cite{woo2022cost-CoST}}} &  \multicolumn{2}{c}{\rotatebox{0}{\scalebox{0.9}{TS2Vec}\cite{Yue2022-TS2Vec}}} \\
    
    \cmidrule(lr){3-18}
    \multicolumn{2}{c}{\scalebox{0.9}{Metric}} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} & \scalebox{0.9}{MSE} & \scalebox{0.9}{MAE} \\
    \toprule
    \multirow{5}{*}{\scalebox{0.9}{\shortstack{ETTh2\\ $\downarrow$ \\ETTh1}}}
    & \scalebox{0.9}{10\%} & \scalebox{0.9}{\textbf{0.591}} & \scalebox{0.9}{\textbf{0.523}} & \scalebox{0.9}{0.653} & \scalebox{0.9}{0.558} & \scalebox{0.9}{0.660} & \scalebox{0.9}{0.517} & \scalebox{0.9}{0.783} & \scalebox{0.9}{0.588} & \scalebox{0.9}{0.645} & \scalebox{0.9}{0.507} & \scalebox{0.9}{0.799} & \scalebox{0.9}{0.783} & \scalebox{0.9}{0.784} & \scalebox{0.9}{0.604} & \scalebox{0.9}{0.655} & \scalebox{0.9}{0.550} \\
    & \scalebox{0.9}{25\%} & \scalebox{0.9}{\textbf{0.535}} & \scalebox{0.9}{\textbf{0.490}} & \scalebox{0.9}{0.632} & \scalebox{0.9}{0.502} & \scalebox{0.9}{0.594} & \scalebox{0.9}{0.518} & \scalebox{0.9}{0.641} & \scalebox{0.9}{0.578} & \scalebox{0.9}{0.610} & \scalebox{0.9}{0.611} & \scalebox{0.9}{0.736} & \scalebox{0.9}{0.725} & \scalebox{0.9}{0.624} & \scalebox{0.9}{0.539} & \scalebox{0.9}{0.632} & \scalebox{0.9}{0.543} \\
    & \scalebox{0.9}{50\%} & \scalebox{0.9}{\textbf{0.491}} & \scalebox{0.9}{\textbf{0.473}} & \scalebox{0.9}{0.512} & \scalebox{0.9}{0.479} & \scalebox{0.9}{0.550} & \scalebox{0.9}{0.504} & \scalebox{0.9}{0.525} & \scalebox{0.9}{0.509} & \scalebox{0.9}{0.540} & \scalebox{0.9}{0.513} & \scalebox{0.9}{0.731} & \scalebox{0.9}{0.704} & \scalebox{0.9}{0.540} & \scalebox{0.9}{0.499} & \scalebox{0.9}{0.599} & \scalebox{0.9}{0.526} \\
    & \scalebox{0.9}{75\%} & \scalebox{0.9}{\textbf{0.466}} & \scalebox{0.9}{\textbf{0.458}} & \scalebox{0.9}{0.499} & \scalebox{0.9}{0.488} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.465} & \scalebox{0.9}{0.516} & \scalebox{0.9}{0.488} & \scalebox{0.9}{0.479} & \scalebox{0.9}{0.470} & \scalebox{0.9}{0.697} & \scalebox{0.9}{0.689} & \scalebox{0.9}{0.494} & \scalebox{0.9}{0.475} & \scalebox{0.9}{0.577} & \scalebox{0.9}{0.534} \\
    & \scalebox{0.9}{100\%} & \scalebox{0.9}{\textbf{0.415}} & \scalebox{0.9}{\textbf{0.430}} & \scalebox{0.9}{0.431} & \scalebox{0.9}{0.448} & \scalebox{0.9}{0.466} & \scalebox{0.9}{0.456} & \scalebox{0.9}{0.469} & \scalebox{0.9}{0.459} & \scalebox{0.9}{0.443} & \scalebox{0.9}{0.471} & \scalebox{0.9}{0.635} & \scalebox{0.9}{0.634} & \scalebox{0.9}{0.428} & \scalebox{0.9}{0.433} & \scalebox{0.9}{0.517} & \scalebox{0.9}{0.486} \\
    \bottomrule
  \end{tabular}
  \end{small}
  \vspace{-5pt}
\end{table*}
```
```{=latex}
\newpage
```
```{=latex}
\begin{table*}[th]\caption{In- and cross-domain settings of classification, where \textbf{all the baselines are based on the encoders utilized in their original papers}. For the in-domain setting, we pre-train and fine-tune on the same dataset: Epilepsy. For the cross-domain setting, we pre-train the model on SleepEEG and then fine-tune it on different datasets: Epilepsy, FD-B, Gesture, and EMG.}
    \label{tab:classification_incrodomain_papermodel}
        \setlength{\tabcolsep}{7pt}
    \vskip 0.1in
    \centering
    \begin{small}
            \renewcommand{\multirowsetup}{\centering}
            \begin{tabular}{l|l|l|cccc|c}
                \toprule
                \multicolumn{2}{c}{Scenarios} &  Models & Accuracy (\%) & Precision (\%) & Recall (\%) & F1 (\%) & Avg (\%) \\
                \midrule
                \multirow{8}{*}{\rotatebox{90}{In-Domain}} & \multirow{8}{*}{\shortstack{Epilepsy\\ $\downarrow$ \\ Epilepsy}}
                & Random init. & 89.83 & 92.13 & 74.47 & 79.59 & 84.00 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 92.17 & 93.84 & 81.19 & 85.71 & 88.23 \\
                & & CoST \cite{woo2022cost-CoST} & 88.07 & 91.58 & 66.05 & 69.11 & 78.70 \\
                & & LaST \cite{wang2022learning-LaST}  & 92.11 & 93.12 & 81.47 & 85.74 & 88.11 \\
                & & TST \cite{zerveas2021transformer-TST}   & 80.21 & 40.11 & 50.00 & 44.51 & 53.71 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 90.09 & 93.90 & 77.24 & 78.21 & 84.86 \\
                & & TF-C \cite{Zhang2022-TF-C} & 93.96 & 94.87 & 85.82 & 89.46 & 91.03 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM}  & \textbf{94.75} & \textbf{95.60} & \textbf{89.93} & \textbf{91.41} & \textbf{92.92} \\
                \midrule
                \multirow{38}{*}{\rotatebox{90}{Cross-Domain}} & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ Epilepsy}}
                & Random init. & 89.83 & 92.13 & 74.47 & 79.59 & 84.00 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 93.95 & 90.59 & 90.39 & 90.45 & 91.35\\
                & & CoST \cite{woo2022cost-CoST} & 88.40 & 88.20 & 72.34 & 76.88 & 81.45 \\
                & & LaST \cite{wang2022learning-LaST} & 86.46 & 90.77 & 66.35 & 70.67 & 78.56 \\
                & & TST \cite{zerveas2021transformer-TST} & 80.21 & 40.11 & 50.00 & 44.51 & 53.71 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 89.71 & 72.36 & 67.47 & 68.55 & 74.52 \\
                & & TF-C \cite{Zhang2022-TF-C} & 94.95 & \textbf{94.56} & 89.08 & 91.49 & 92.52 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{95.49} & 93.36 & \textbf{92.28} & \textbf{92.81} & \textbf{93.49} \\
                \cmidrule(lr){2-8}
                & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ FD-B}}
                & Random init. & 47.36 & 48.29 & 52.35 & 49.11 & 49.28 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 47.90 & 43.39 & 48.42 & 43.89 & 45.90\\
                & & CoST \cite{woo2022cost-CoST} & 47.06 & 38.79 & 38.42 & 34.79 & 39.76 \\
                & & LaST \cite{wang2022learning-LaST} & 46.67 & 43.90 & 47.71 & 45.17 & 45.86  \\
                & & TST \cite{zerveas2021transformer-TST} & 46.40 & 41.58 & 45.50 & 41.34 & 43.71 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 60.88 & 66.98 & 68.94 & 66.56 & 65.84 \\
                & & TF-C \cite{Zhang2022-TF-C} & 69.38 & \textbf{75.59} & 72.02 & 74.87 & 72.97 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{69.40} & 74.18 & \textbf{76.41} & \textbf{75.11} & \textbf{73.78} \\
                \cmidrule(lr){2-8}
                & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ Gesture}}
                & Random init. & 42.19 & 47.51 & 49.63 & 48.86 & 47.05 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 69.17 & 65.45 & 68.54 & 65.70 & 67.22 \\
                & & CoST \cite{woo2022cost-CoST} & 68.33 & 65.30 & 68.33 & 66.42 & 67.09 \\
                & & LaST \cite{wang2022learning-LaST} & 64.17 & 70.36 & 64.17 & 58.76 & 64.37 \\
                & & TST \cite{zerveas2021transformer-TST} & 69.17 & 66.60 & 69.17 & 66.01 & 67.74 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 71.88 & 70.35 & 76.75 & 68.37 & 71.84 \\
                & & TF-C \cite{Zhang2022-TF-C} & 76.42 & 77.31 & 74.29 & 75.72 & 75.94 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{80.00} & \textbf{79.03} & \textbf{80.00} & \textbf{78.67} & \textbf{79.43} \\
                \cmidrule(lr){2-8}
                & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ EMG}}
                & Random init. & 77.80 & 59.09 & 66.67 & 62.38 & 66.49 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 78.54 & 80.40 & 67.85 & 67.66 & 73.61\\
                & & CoST \cite{woo2022cost-CoST} & 53.65 & 49.07 & 42.10 & 35.27 & 45.02 \\
                & & LaST \cite{wang2022learning-LaST} & 66.34 & 79.34 & 63.33 & 72.55 & 70.39 \\
                & & TST \cite{zerveas2021transformer-TST} & 78.34 & 77.11 & 80.30 & 68.89 & 76.16 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 69.99 & 70.25 & 63.44 & 70.89 & 68.64 \\
                & & TF-C \cite{Zhang2022-TF-C} & 81.71 & 72.65 & 81.59 & 76.83 & 78.20 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{97.56} & \textbf{98.33} & \textbf{98.04} & \textbf{98.14} & \textbf{98.02} \\
                \bottomrule
            \end{tabular}
          \vspace{-5pt}
    \end{small}
\end{table*}
```
```{=latex}
\newpage
```
```{=latex}
\begin{table*}[h]\caption{In- and cross-domain settings of classification \textbf{based on the unified 1-D ResNet encoder}. For the in-domain setting, we pre-train and fine-tune on the same dataset: Epilepsy. For the cross-domain setting, we pre-train on SleepEEG and fine-tune on datasets from different domains: Epilepsy, FD-B, Gesture, and EMG.}
    \label{tab:classification_incrodomain_resnets}
        \setlength{\tabcolsep}{7pt}
    \vskip 0.1in
    \centering
    \begin{small}
            \renewcommand{\multirowsetup}{\centering}
            \begin{tabular}{l|l|l|cccc|c}
                \toprule
                \multicolumn{2}{c}{Scenarios} &  Models & Accuracy (\%) & Precision (\%) & Recall (\%) & F1 (\%) & Avg (\%)\\
                \midrule
                \multirow{8}{*}{\rotatebox{90}{In-Domain}} & \multirow{8}{*}{\shortstack{Epilepsy\\ $\downarrow$ \\ Epilepsy}}
                & Random init. & 89.83 & 92.13 & 74.47 & 79.59 & 84.00 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 92.33 & 94.53 & 81.11 & 86.33 & 88.58 \\
                & & CoST \cite{woo2022cost-CoST} & 92.35 & 94.73 & 81.16 & 85.92 & 88.54 \\
                & & LaST \cite{wang2022learning-LaST} & - & - & - & - & - \\
                & & TST \cite{zerveas2021transformer-TST} & 80.89 & 90.38 & 51.73 & 48.01 & 67.75  \\
                & & Ti-MAE \cite{li2023ti-TiMAE}  & 80.34 & 90.16 & 50.33 & 45.20 & 66.51 \\
                & & TF-C \cite{Zhang2022-TF-C} & 93.96 & 94.87 & 85.82 & 89.46 & 91.03 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{94.75} & \textbf{95.60} & \textbf{89.93} & \textbf{91.41} & \textbf{92.92} \\
                \midrule
                \multirow{38}{*}{\rotatebox{90}{Cross-Domain}} & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ Epilepsy}}
                & Random init. & 89.83 & 92.13 & 74.47 & 79.59 & 84.00 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 94.46 & 91.99 & 90.28 & 91.10 & 91.95 \\
                & & CoST \cite{woo2022cost-CoST} & 93.66 & 91.39 & 88.08 & 89.60 & 90.68 \\
                & & LaST \cite{wang2022learning-LaST} & - & - & - & - & - \\
                & & TST \cite{zerveas2021transformer-TST} & 82.89 & 86.15 & 79.02 & 80.44  & 82.13 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 73.45 & 72.56 & 65.34 & 77.20 & 72.14\\
                & & TF-C \cite{Zhang2022-TF-C} & 94.95 & \textbf{94.56} & 89.08 & 91.49 & 92.52 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{95.49} & 93.36 & \textbf{92.28} & \textbf{92.81} & \textbf{93.49} \\
                \cmidrule(lr){2-8}
                & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ FD-B}}
                & Random init. & 47.36 & 48.29 & 52.35 & 49.11 & 49.28 \\
                \cmidrule(lr){3-8}
                & & TS2Vec \cite{Yue2022-TS2Vec} & 60.74 & 59.60 & 64.27 & 61.07 & 61.42 \\
                & & CoST \cite{woo2022cost-CoST} & 54.82 & 51.92 & 63.30 & 54.34 & 56.09 \\
                & & LaST \cite{wang2022learning-LaST} & - & - & - & - & - \\
                & & TST \cite{zerveas2021transformer-TST} & 65.57 & 70.05 & 67.57 & 64.41 & 66.90 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 67.98 & 62.83 & 64.45 & 63.36 & 64.66 \\
                & & TF-C \cite{Zhang2022-TF-C} & 69.38 & \textbf{75.59} & 72.02 & 74.87 & 72.97 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{69.40} & 74.18 & \textbf{76.41} & \textbf{75.11} & \textbf{73.78} \\
                \cmidrule(lr){2-8}
                & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ Gesture}}
                & Random init. & 42.19 & 47.51 & 49.63 & 48.86 & 47.05 \\
                \cmidrule(lr){3-8}
                & & TS2vec \cite{Yue2022-TS2Vec} & 73.33 & 70.88 & 73.33 & 71.56 & 72.27 \\
                & & CoST \cite{woo2022cost-CoST} & 73.33 & 74.37 & 73.33 & 71.16 & 73.04 \\
                & & LaST \cite{wang2022learning-LaST} & - & - & - & - & - \\
                & & TST \cite{zerveas2021transformer-TST} & 75.12 & 76.05 & 67.74 & 73.24 & 73.04 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 75.54 & 69.32 & 72.42 & 69.32 & 71.65 \\
                & & TF-C \cite{Zhang2022-TF-C} & 76.42 & 77.31 & 74.29 & 75.72 & 75.94 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{80.00} & \textbf{79.03} & \textbf{80.00} & \textbf{78.67} & \textbf{79.43} \\
                \cmidrule(lr){2-8}
                & \multirow{8}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\ EMG}}
                & Random init. & 77.80 & 59.09 & 66.67 & 62.38 & 66.49 \\
                \cmidrule(lr){3-8}
                & & TS2vec \cite{Yue2022-TS2Vec}& 80.92 & 69.63 & 67.65 & 67.90 & 71.52 \\
                & & CoST \cite{woo2022cost-CoST} & 73.17 & 70.47 & 69.84 & 70.00 & 70.87 \\
                & & LaST \cite{wang2022learning-LaST} & - & - & - & - & - \\
                & & TST \cite{zerveas2021transformer-TST} & 75.89 & 74.67 & 80.66 & 78.48 & 77.43 \\
                & & Ti-MAE \cite{li2023ti-TiMAE} & 63.52 & 67.77 & 70.55 & 58.32 & 65.04 \\
                & & TF-C \cite{Zhang2022-TF-C} & 81.71 & 72.65 & 81.59 & 76.83 & 78.20 \\
                \cmidrule(lr){3-8}
                & & \textbf{SimMTM} & \textbf{97.56} & \textbf{98.33} & \textbf{98.04} & \textbf{98.14} & \textbf{98.02} \\
                \bottomrule
            \end{tabular}
          \vspace{-5pt}
    \end{small}
\end{table*}
```
```{=latex}
\begin{table*}[h]\caption{Full ablation studies under the in-domain and cross-domain settings of classification. Under the \emph{Avg} metric, the standard deviations of SimMTM are within 0.2\% for Epilepsy, within 0.5\% for FD-B, within 0.6\% for Gesture, and within 0.1\% for EMG.}
  \label{tab:full_ablation_classification_abs_full}
   %\vspace{-20pt}
   \vskip 0.15in
   \centering
   \begin{small}
   \renewcommand{\multirowsetup}{\centering}
   \setlength{\tabcolsep}{8pt}
   \renewcommand\arraystretch{1.2}
   \begin{tabular}{c|l|cccc|c}
     \toprule
      \multicolumn{2}{c}{Scenarios} & Accuracy (\%) & Precision (\%) & Recall (\%) & F1 (\%) & Avg (\%) \\
     \toprule
     \multirow{4}{*}{\shortstack{Epilepsy\\ $\downarrow$ \\Epilepsy}}
     &  Random init. & 89.83 & 92.13 & 74.47 & 79.59 & 84.00 \\
     & W/o ${\cal L}_{\text{reconstruction}}$ & 93.80 & \textbf{96.11} & 86.11 & 89.45 & 91.37 \\
     & W/o ${\cal L}_{\text{constraint}}$ & 90.99 & 92.81 & 79.86 & 84.13 & 86.95 \\
     \cmidrule(lr){2-7} 
     & \textbf{SimMTM} & \textbf{94.75} & 95.60 & \textbf{89.93} & \textbf{91.41} & \textbf{92.92} \\
     \midrule
     \multirow{4}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\Epilepsy}}
     &  Random init. & 89.83 & 92.13 & 74.47 & 79.59 & 84.00 \\
     & W/o ${\cal L}_{\text{reconstruction}}$ & 94.54 & \textbf{93.87} & 88.46 & 90.84 & 91.93 \\
     & W/o ${\cal L}_{\text{constraint}}$ & 91.73 & 90.57 & 82.21 & 85.53 & 87.51 \\
     \cmidrule(lr){2-7} 
     & \textbf{SimMTM} & \textbf{95.49} & 93.36 & \textbf{92.28} & \textbf{92.81} & \textbf{93.49} \\
     \midrule
     \multirow{4}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\FD-B}}  
     &  Random init. & 47.36 & 48.29 & 52.35 & 49.11 & 49.28 \\
     & W/o ${\cal L}_{\text{reconstruction}}$ & 66.11 & 67.97 & 74.70 & 70.01 & 69.70 \\
     & W/o ${\cal L}_{\text{constraint}}$ & 53.71 & 69.48 & 62.67 & 50.86 & 59.18 \\
     \cmidrule(lr){2-7} 
     & \textbf{SimMTM} & \textbf{69.40} & \textbf{74.18} & \textbf{76.41} & \textbf{75.11} & \textbf{73.78} \\
     \midrule
     \multirow{4}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\Gesture}} 
     &  Random init. & 42.19 & 47.51 & 49.63 & 48.86 & 47.05 \\
     & W/o ${\cal L}_{\text{reconstruction}}$ & 78.50 & 79.01 & 78.50 & 77.17 & 78.30 \\
     & W/o ${\cal L}_{\text{constraint}}$ & 76.67 & 74.91 & 76.67 & 74.80 & 75.76 \\
     \cmidrule(lr){2-7} 
     & \textbf{SimMTM} & \textbf{80.00} & \textbf{79.03} & \textbf{80.00} & \textbf{78.67} & \textbf{79.43} \\
     \midrule
     \multirow{4}{*}{\shortstack{SleepEEG\\ $\downarrow$ \\EMG}} 
     &  Random init. & 77.80 & 59.09 & 66.67 & 62.38 & 66.49 \\
     & W/o ${\cal L}_{\text{reconstruction}}$ & 90.24 & 94.20 & 78.04 & 81.53 & 86.00 \\
     & W/o ${\cal L}_{\text{constraint}}$ & 85.37 & 89.97 & 69.62 & 70.74 & 78.93 \\
     \cmidrule(lr){2-7} 
     & \textbf{SimMTM} & \textbf{97.56} & \textbf{98.33} & \textbf{98.04} & \textbf{98.14} & \textbf{98.02} \\
     \bottomrule
   \end{tabular}
   \end{small}
   \vspace{30pt}
\end{table*}
```

[^1]: Equal Contribution
