from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

import collections
import math
import os
import random
import zipfile

import numpy as np
from six.moves import urllib
from six.moves import xrange
import tensorflow as tf
class BasicPatternEmbedding:
    def __init__(self):
        self.url = 'http://mattmahoney.net/dc/'
        self.data_index = 0
        self.vocabulary_size = 5000
        self.batch_size = 128
        self.embedding_size = 128  # Dimension of the embedding vector.
        self.skip_window = 1       # How many words to consider left and right.
        self.num_skips = 2         # How many times to reuse an input to generate a label.
        # We pick a random validation set to sample nearest neighbors. Here we limit the
        # validation samples to the words that have a low numeric ID, which by
        # construction are also the most frequent.
        self.valid_size = 16     # Random set of words to evaluate similarity on.
        self.valid_window = 100  # Only pick dev samples in the head of the distribution.
        # Choose 16 numbers from 0 to 99 randomly.
        self.valid_examples = np.random.choice(self.valid_window, self.valid_size, replace=False)
        self.num_sampled = 64    # Number of negative examples to sample.
        self.num_steps = 10001
        self.final_embeddings = None
        self.graph = tf.Graph()

    # Download and verify the dataset file.
    def maybe_download(self, filename, expected_bytes):
        # If the dataset file is not under the current path, download it directly.
        if not os.path.exists(filename):
            filename, _ = urllib.request.urlretrieve(self.url + filename, filename)
        # Get the dataset file information.
        statinfo = os.stat(filename)
        # Verify the file size.
        if statinfo.st_size == expected_bytes:
            print('Found and verified', filename)
        else:
            print(statinfo.st_size)
            raise Exception(
                'Failed to verify ' + filename + '. Can you get to it with a browser?')
        return filename

    # Read the data from the zip archive into a list of strings.
    def read_data(self, filename):
        with zipfile.ZipFile(filename) as f:
            # Split on the default separators, i.e. all whitespace,
            # including spaces, newlines (\n) and tabs (\t).
            data = tf.compat.as_str(f.read(f.namelist()[0])).split()
        return data

    # Process raw inputs into a dataset.
    def build_dataset(self, words):
        # Reserve the first slot for unknown words.
        count = [['UNK', -1]]
        # Count the word list and add (word, frequency) pairs to the count list.
        count.extend(collections.Counter(words).most_common(self.vocabulary_size - 1))
        dictionary = dict()
        # Create a dictionary mapping each word to a serial number.
        for word, _ in count:
            dictionary[word] = len(dictionary)
        data = list()
        unk_count = 0
        # Convert the word list into a number list, 0 for unknown words.
        for word in words:
            if word in dictionary:
                index = dictionary[word]
            else:
                index = 0
                unk_count += 1
            data.append(index)
        # Update the number of UNK occurrences.
        count[0][1] = unk_count
        # Generate a new dictionary by exchanging keys and values.
        reversed_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
        return data, count, dictionary, reversed_dictionary

    # Generate a training batch for the skip-gram model.
    def generate_batch(self, data):
        # Make sure the batch size and window settings are consistent.
        assert self.batch_size % self.num_skips == 0
        assert self.num_skips <= 2 * self.skip_window
        batch = np.ndarray(shape=(self.batch_size), dtype=np.int32)
        labels = np.ndarray(shape=(self.batch_size, 1), dtype=np.int32)
        span = 2 * self.skip_window + 1  # [ skip_window target skip_window ]
        # Create a double-ended queue to serve as the sliding window buffer.
        buffer = collections.deque(maxlen=span)
        # data_index indicates the end point of the current window.
        if self.data_index + span > len(data):
            self.data_index = 0
        buffer.extend(data[self.data_index:self.data_index + span])
        self.data_index += span
        for i in range(self.batch_size // self.num_skips):
            target = self.skip_window  # target label at the center of the buffer
            targets_to_avoid = [self.skip_window]
            # Sample num_skips (batch, label) pairs from the current window.
            for j in range(self.num_skips):
                # Avoid sampling the same target twice.
                while target in targets_to_avoid:
                    target = random.randint(0, span - 1)
                targets_to_avoid.append(target)
                # Each batch item is the input word at the center of the window.
                batch[i * self.num_skips + j] = buffer[self.skip_window]
                # Each label item is the ground-truth context word.
                labels[i * self.num_skips + j, 0] = buffer[target]
            if self.data_index == len(data):
                buffer.extend(data[:span])
                self.data_index = span
            else:
                buffer.append(data[self.data_index])
                self.data_index += 1
        # Backtrack a little bit to avoid skipping words at the end of a batch.
        self.data_index = self.data_index - span
        return batch, labels

    def train(self, data, reverse_dictionary):
        with self.graph.as_default():
            train_inputs = tf.placeholder(tf.int32, shape=[self.batch_size])
            train_labels = tf.placeholder(tf.int32, shape=[self.batch_size, 1])
            valid_dataset = tf.constant(self.valid_examples, dtype=tf.int32)
            # Ops and variables pinned to the CPU.
            with tf.device('/cpu:0'):
                # Look up embeddings for inputs.
                embeddings = tf.Variable(
                    tf.random_uniform([self.vocabulary_size, self.embedding_size], -1.0, 1.0))
                # From the embedding matrix, extract the 128-dimensional vector
                # corresponding to each input word (train_inputs).
                embed = tf.nn.embedding_lookup(embeddings, train_inputs)
                # Construct the variables for the NCE loss.
                nce_weights = tf.Variable(
                    tf.truncated_normal([self.vocabulary_size, self.embedding_size],
                                        stddev=1.0 / math.sqrt(self.embedding_size)))
                nce_biases = tf.Variable(tf.zeros([self.vocabulary_size]))

            # Compute the average NCE loss for the batch.
            # tf.nn.nce_loss automatically draws a new sample of the negative labels each
            # time we evaluate the loss.
            loss = tf.reduce_mean(
                tf.nn.nce_loss(weights=nce_weights,
                               biases=nce_biases,
                               labels=train_labels,
                               inputs=embed,
                               num_sampled=self.num_sampled,
                               num_classes=self.vocabulary_size))

            # Construct the SGD optimizer using a learning rate of 1.0.
            optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
            # Compute the cosine similarity between minibatch examples and all embeddings.
            norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
            normalized_embeddings = embeddings / norm
            valid_embeddings = tf.nn.embedding_lookup(normalized_embeddings, valid_dataset)
            similarity = tf.matmul(valid_embeddings, normalized_embeddings, transpose_b=True)
            # Add the variable initializer.
            init = tf.global_variables_initializer()

        with tf.Session(graph=self.graph) as session:
            init.run()
            average_loss = 0
            for step in xrange(self.num_steps):
                batch_inputs, batch_labels = self.generate_batch(data)
                feed_dict = {train_inputs: batch_inputs, train_labels: batch_labels}
                # We perform one update step by evaluating the optimizer op (including it
                # in the list of returned values for session.run()).
                _, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
                average_loss += loss_val

                if step % 2000 == 0:
                    if step > 0:
                        average_loss /= 2000
                    # The average loss is an estimate of the loss over the last 2000 batches.
                    print('Average loss at step ', step, ': ', average_loss)
                    average_loss = 0

                # Print the eight most similar words for each validation word.
                if step % 10000 == 0:
                    sim = similarity.eval()
                    for i in xrange(self.valid_size):
                        valid_word = reverse_dictionary[self.valid_examples[i]]
                        top_k = 8  # number of nearest neighbors
                        nearest = (-sim[i, :]).argsort()[1:top_k + 1]
                        log_str = 'Nearest to %s:' % valid_word
                        for k in xrange(top_k):
                            close_word = reverse_dictionary[nearest[k]]
                            log_str = '%s %s,' % (log_str, close_word)
                        print(log_str)

            self.final_embeddings = normalized_embeddings.eval()

    # Visualize the embeddings.
    def plot_with_labels(self, low_dim_embs, labels, filename='tsne.png'):
        assert low_dim_embs.shape[0] >= len(labels), 'More labels than embeddings'
        plt.figure(figsize=(18, 18))  # in inches
        for i, label in enumerate(labels):
            x, y = low_dim_embs[i, :]
            plt.scatter(x, y)
            plt.annotate(label,
                         xy=(x, y),
                         xytext=(5, 2),
                         textcoords='offset points',
                         ha='right',
                         va='bottom')
        plt.show()
        # plt.savefig(filename)
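A minimal driver sketch for the class above, assuming the text8 corpus used by the standard TensorFlow word2vec tutorial (the file name, expected byte count, and t-SNE settings below come from that tutorial and are illustrative rather than required):

model = BasicPatternEmbedding()

# Download and load the corpus.
filename = model.maybe_download('text8.zip', 31344016)
words = model.read_data(filename)

# Build the numeric dataset and train the skip-gram model.
data, count, dictionary, reverse_dictionary = model.build_dataset(words)
del words  # the raw word list is no longer needed
model.train(data, reverse_dictionary)

# Project a few learned embeddings to 2-D with t-SNE and plot them.
tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000, method='exact')
plot_only = 200
low_dim_embs = tsne.fit_transform(model.final_embeddings[:plot_only, :])
labels = [reverse_dictionary[i] for i in xrange(plot_only)]
model.plot_with_labels(low_dim_embs, labels)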
# Signature: tf.nn.embedding_lookup(params, ids, partition_strategy='mod', name=None, validate_indices=True, max_norm=None)
# Docstring:
# Looks up `ids` in a list of embedding tensors.
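As a quick illustration of what the lookup does for a single (non-partitioned) parameter tensor, the NumPy sketch below reproduces the same row-gathering behaviour; the toy matrix and ids are made up for illustration.

import numpy as np

# A toy "embedding matrix": 5 words, 3-dimensional vectors.
params = np.arange(15, dtype=np.float32).reshape(5, 3)
ids = np.array([3, 0, 3])

# Row gather: the NumPy analogue of tf.nn.embedding_lookup(params, ids)
# when params is a single tensor with the default partition strategy.
print(params[ids])  # rows 3, 0 and 3 of params, shape (3, 3)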
Customer Value Chain Analysis (CVCA) is an original methodological tool that enables design teams in the product definition phase to comprehensively identify pertinent stakeholders, their relationships with each other, and their role in the product’s life cycle.
Method
CVCA Step 1: Determine the business model for the vending machine.
CVCA Step 2: Delineate pertinent parties involved with the vending machine’s life cycle.
CVCA Step 3: Determine how the vending machine’s customers are related to each other.
CVCA Step 4: Identify the value propositions of the vending machine’s customers and define the flows between them.
CVCA Step 5: Analyze the Customer Chain to determine the vending machine’s critical customers and their value propositions. The vending operator and the soft drink bottler were determined to be the critical customers of the vending machine manufacturer.
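As a small illustration of Steps 2 through 4, the customer chain can be recorded as a directed graph whose edges carry the value flows exchanged between parties. The Python sketch below uses invented parties and flows for illustration only; they are not taken from the original analysis.

# Illustrative placeholder parties and value flows, not the original CVCA data.
# Each directed edge (source, target) lists what source delivers to target.
value_flows = {
    ('vending machine manufacturer', 'vending operator'): ['machines'],
    ('vending operator', 'vending machine manufacturer'): ['money', 'requirements'],
    ('soft drink bottler', 'vending operator'): ['product'],
    ('vending operator', 'soft drink bottler'): ['money'],
    ('consumer', 'vending operator'): ['money'],
    ('vending operator', 'consumer'): ['product'],
}

# Step 5 then amounts to inspecting which parties send critical flows
# (for example money and requirements) back toward the manufacturer.
for (src, dst), flows in value_flows.items():
    if dst == 'vending machine manufacturer':
        print('%s -> manufacturer: %s' % (src, ', '.join(flows)))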
This paper presents three interrelated frameworks as a first attempt to define the fundamentals of service systems.
The work system framework uses nine basic elements to provide a system-oriented view of any system that performs work within or across organizations. Service systems are work systems.
The service value chain framework augments the work system framework by introducing functions that are associated specifically with services. It presents a two-sided view of service processes based on the common observation that services are typically coproduced by service providers and customers.
The work system life cycle model looks at how work systems (including service systems) change and evolve over time. It treats the life cycle of a system as a set of iterations involving planned and unplanned change.
This paper uses two examples, one largely manual and one highly automated, to illustrate the potential usefulness of the three frameworks, which can be applied together to describe, analyze, and study how service systems are created, how they operate, and how they evolve through a combination of planned and unplanned change.
A precise and unambiguous description of the meaning of a mathematical term. It characterizes the meaning of a word by giving all the properties and only those properties that must be true.
Theorem(定理)
A mathematical statement that is proved using rigorous mathematical reasoning. In a mathematical paper, the term theorem is often reserved for the most important results.
Lemma(引理)
A minor result whose sole purpose is to help in proving a theorem. It is a stepping stone on the path to proving a theorem. Very occasionally lemmas can take on a life of their own.
Corollary(推论)
A result in which the (usually short) proof relies heavily on a given theorem (we often say that “this is a corollary of Theorem A”).
Proposition(命题)
A proved and often interesting result, but generally less important than a theorem.
Conjecture(推测,猜想)
A statement that is unproved, but is believed to be true.
Claim(断言)
An assertion that is then proved. It is often used like an informal lemma.
Axiom/Postulate(公理/假定)
A statement that is assumed to be true without proof. These are the basic building blocks from which all theorems are proved.
Identity(恒等式)
A mathematical expression giving the equality of two (often variable) quantities.
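A standard illustrative example (not from the original glossary): $(x+1)^2 = x^2 + 2x + 1$ is an identity, since it holds for every value of $x$, whereas $x + 1 = 3$ is an equation that holds only for $x = 2$.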
Paradox(悖论)
A statement that can be shown, using a given set of axioms and definitions, to be both true and false. Paradoxes are often used to show the inconsistencies in a flawed theory. The term paradox is often used informally to describe a surprising or counterintuitive result that follows from a given set of rules.
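A classical illustration (not from the original glossary) is Russell's paradox: in naive set theory the collection $R = \{x : x \notin x\}$ satisfies $R \in R$ if and only if $R \notin R$, which exposes the inconsistency of unrestricted set comprehension.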
Constructive proofs are closely related to the notion of a computer program: to prove the proposition $(\forall x \in A)(\exists y \in B)P(x,y)$ constructively, one must exhibit a function f which, applied to an element a of A, produces an element b of B such that P(a,b) holds. If P(a,b) describes a specification, then the function f proving the proposition is a program that satisfies the specification. A constructive proof can therefore itself be regarded as a computer program, and the computation performed by the program corresponds to the normalization of the proof. Because of this computational content of constructive proofs, type theory can be used as a programming language; moreover, since a program is obtained from the proof of its specification, type theory can also be used as a programming logic.
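A minimal Python sketch of this reading of constructive proofs (the specification P and the witness function f below are invented for illustration): the computational content of a proof of $(\forall x \in A)(\exists y \in B)P(x,y)$ is a function that, given x, actually computes a witness y, and running it checks the specification.

def P(x, y):
    # Illustrative specification: y is an even number strictly greater than x.
    return y > x and y % 2 == 0

def f(x):
    # The computational content of a constructive proof of (forall x)(exists y) P(x, y):
    # for every input x it produces a concrete witness y.
    return x + 2 if x % 2 == 0 else x + 1

# Executing the extracted program amounts to checking the specification.
for a in range(5):
    b = f(a)
    assert P(a, b), (a, b)
    print('P(%d, %d) holds' % (a, b))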